Secretory expression in eukaryotes.

Methods and compositions are provided for producing 
polypeptide sequences in high yield by employing DNA 
constructs, wherein the DNA sequence encoding for the 
polypeptide of interest is preceded by a leader sequence and
processing sequence for secreting and processing said 
polypeptide. In this manner, the mature polypeptide of 
interest may be isolated from the nutrient medium substan- 
tially free of major amounts of other proteins and cellular 
debris. 

The yeast strain S. cerevisiae AB103 (pYEGF-8) was
deposited on January 5, 1983, at the A.T.C.C. and given
Accession No. 20658.

The plasmid pYαEGF-23 (pAB114-pC1/1) was deposited
at the A.T.C.C. on August 12, 1983, and given Accession No.
40079.
SECRETORY EXPRESSION IN EUKARYOTES 

BACKGROUND OF THE INVENTION 

Field of the Invention 

Hybrid DNA technology has revolutionized the 
ability to produce polypeptides of an infinite variety
of compositions. Since living forms are composed of 
proteins and employ proteins for regulation, the 
ability to duplicate these proteins at will offers 
unique opportunities for investigating the manner in 
which these proteins function and the use of such 
proteins, fragments of such proteins, or analogs in 
therapy and diagnosis. 

There have been numerous advances in improving
the rate and amount of protein produced by a cell.
Most of these advances have been associated with higher
copy numbers, more efficient promoters, and means for
reducing the amount of degradation of the desired
product. It is evident that it would be extremely
desirable to be able to secrete polypeptides of interest,
where such polypeptides are the product of interest.

Furthermore, in many situations, the polypep- 
tide of interest does not have an initial methionine 
amino acid. This is usually a result of there being a 
processing signal in the gene encoding for the polypep- 
tide of interest, which the gene source recognizes and 
cleaves with an appropriate peptidase. Since in most 
situations, genes of interest are heterologous to the 
host in which the gene is to be expressed, such proces- 
sing occurs imprecisely and in low yield in the expres- 
sion host. In this case, while the protein which is 
obtained will be identical to the peptide of interest 
for almost all of its sequence, it will differ at the 
N-terminus, which can deleteriously affect physiological
activity. 
There are, therefore, many reasons why it
would be extremely advantageous to prepare DNA
sequences which would encode for the secretion and
maturing of the polypeptide product. Furthermore,
where sequences can be found for processing, which
result in the removal of amino acids superfluous to the
polypeptide of interest, the opportunity exists for
having a plurality of DNA sequences, either the same or
different, in tandem, which may be encoded on a single
transcript.

Description of the Prior Art 

U.S. Patent No. 4,336,336 describes for pro-
karyotes the use of a leader sequence coding for a non-
cytoplasmic protein normally transported to or beyond
the cell surface, resulting in transfer of the fused
protein to the periplasmic space. U.S. Patent No.
4,338,397 describes for prokaryotes using a leader
sequence which provides for secretion with cleavage of
the leader sequence from the polypeptide sequence of
interest. U.S. Patent No. 4,338,397, columns 3 and 4,
provides useful definitions, which definitions are
incorporated herein by reference.

Kurjan and Herskowitz, Cell (1982) 30:933-943
describe a putative α-factor precursor containing four
tandem copies of mature α-factor, describing the
sequence and postulating a processing mechanism.
Kurjan and Herskowitz, Abstracts of Papers presented at
the 1981 Cold Spring Harbor meeting on The Molecular
Biology of Yeasts, page 242, in an Abstract entitled,
"A Putative α-Factor Precursor Containing Four Tandem
Repeats of Mature α-Factor," describe the sequence
encoding for the α-factor and spacers between two of
such sequences. Blair et al., Abstracts of Papers,
ibid, page 243, in an Abstract entitled "Synthesis and
Processing of Yeast Pheromones: Identification and
Characterization of Mutants That Produce Altered α-
Factors," describe the effect of various mutants on
the production of mature α-factor.

SUMMARY OF THE INVENTION 
Methods and compositions are provided for 
producing mature polypeptides. DNA constructs are 
provided which join the DNA fragments encoding for a 
yeast leader sequence and processing signal to heterolo- 
gous genes for secretion and maturation of the poly- 
peptide product. The construct of the DNA encoding for 
the N-terminal cleavable oligopeptide and the DNA 
sequence encoding for the mature polypeptide product 
can be joined to appropriate vectors for introduction 
into yeast or other cell which recognizes the processing 
signals for production of the desired polypeptide.
Other capabilities may also be introduced into the 
construct for various purposes. 

BRIEF DESCRIPTION OF THE DRAWINGS 

Fig. 1 is a flow diagram indicating the
construction of pYαEGF-21.

Fig. 2 shows sequences at fusions of hEGF to
the vector; a. through e. show the sequences at the
N-terminal region of hEGF, which differ among several
constructions, and f. shows the C-terminal region of
hEGF.

DESCRIPTION OF THE SPECIFIC EMBODIMENTS 

In accordance with the subject invention,
eukaryotic hosts, particularly yeast, are employed for
the production of mature polypeptides, where such
polypeptides may be harvested from a nutrient medium.
The polypeptides are produced by employing a DNA
construct encoding for yeast leader and processing
signals joined to a polypeptide of interest, which may
be a single polypeptide or a plurality of polypeptides
separated by processing signals. The resulting
construct encodes for a pre-pro-polypeptide which will
contain the signals for secretion of the pre-pro-
polypeptide and processing of the polypeptide, either
intracellularly or extracellularly, to the mature
polypeptide.

The constructs of the subject invention will, for example,
have at least the following formula defining a pro-
polypeptide:

((R)_r-(GAXYCX)_n-Gene*)_y

wherein:

R is CGX or AZZ, the codons coding for lysine
and arginine, each of the Rs being the same or different;

r is an integer of from 2 to 4, usually 2 to
3, preferably 2 or 4;

X is any of the four nucleotides, T, G, C, or
A;

Y is G or C;

y is an integer of at least one and usually
not more than 10, more usually not more than four,
providing for monomers and multimers;

Z is A or G;

Gene* is a gene other than α-factor, usually
foreign to a yeast host, usually a heterologous gene,
desirably a plant or mammalian gene; and

n is 0 or an integer which will generally
vary from 1 to 4, usually 2 to 3.

The pro-polypeptide has an N-terminal proces-
sing signal for peptidase removal of the amino acids
preceding the amino acids encoded for by Gene*.
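
By way of illustration only, the processing-signal portion of the above formula, (R)_r-(GAXYCX)_n, can be written as a pattern and tested against a candidate DNA string. The following is a minimal sketch and not part of the described method; the function and constant names are hypothetical, and the test strings are assembled from the lys-arg codons (AAA AGA) and the glu-ala sequence (GAAGCT) that appear elsewhere in this specification.

```python
import re

# Degenerate symbols as defined in the text:
# X = any nucleotide, Y = G or C, Z = A or G.
R_CODON = r"(?:CG[ACGT]|A[AG][AG])"      # R: CGX or AZZ (lysine/arginine codons)
DIPEPTIDE = r"(?:GA[ACGT][GC]C[ACGT])"   # GAXYCX

# (R)_r with r = 2..4, followed by (GAXYCX)_n with n = 0..4.
PROCESSING_SIGNAL = re.compile(f"{R_CODON}{{2,4}}{DIPEPTIDE}{{0,4}}")


def is_processing_signal(dna: str) -> bool:
    """Return True if the whole string fits (R)_r-(GAXYCX)_n."""
    return PROCESSING_SIGNAL.fullmatch(dna.upper()) is not None


if __name__ == "__main__":
    print(is_processing_signal("AAAAGAGAAGCT"))  # lys-arg followed by glu-ala
    print(is_processing_signal("AAAAGA"))        # lys-arg only (n = 0)
    print(is_processing_signal("AAAGAAGCT"))     # a single R codon: r is too small
```

The upper bounds of 4 on r and n follow the ranges stated in the definitions; a different range would only require changing the two quantifiers.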

For the most part, the constructs of the
subject invention will have at least the following
formula:

L-((R-S-(GAXYCX)_n)-Gene*)_y

defining a pre-pro-polypeptide, wherein all
the symbols except L and S have been defined, S having
the same definition as R, there being 1 R and 1 S, and L
is a leader sequence providing for secretion of the
pre-pro-polypeptide. While it is feasible to have more
Rs and Ss, there will usually be no advantage in the
additional amino acids. Any leader sequence may be
employed which provides for secretion, leader sequences
generally being of about 30 to 120 amino acids, usually
about 30 to 100 amino acids, having a hydrophobic
region and having a methionine at its N-terminus.

The construct when n is 0 will have the
following formula:

L-((R)_r'-Gene*)_y

defining a pre-pro-polypeptide, wherein all the symbols
have been defined previously, except r', wherein:

r' is 2 to 4, preferably 2 or 4.

Of particular interest is the leader sequence
of α-factor which is described in Kurjan and
Herskowitz, supra, on page 937 or fragments or analogs
thereof, which provide for efficient secretion of the
desired polypeptides. Furthermore, the DNA sequence
indicated in the article, which sequence is incorporated
herein by reference, is not essential, any sequence
which encodes for the desired oligopeptide being
sufficient. Different sequences will be more or less
efficiently translated.

While the above formulas are preferred, it 
should be understood, that with suppressor mutants, 
other sequences could be provided which would result in 
the desired function. Normally, suppressor mutants are 
not as efficient for expression and, therefore, the 
above indicated sequence or equivalent sequence encoding 
for the same amino acid sequence is preferred. To the 
extent that a mutant will express from a different 
codon the same amino acids which are expressed by the 
above sequence, then such alternative sequence could be 
permitted. 

The dipeptides which are encoded for by the
sequence in the parenthesis will be an acidic amino
acid, aspartic or glutamic, preferably glutamic,
followed by a neutral amino acid, alanine or proline,
particularly alanine.
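
The correspondence between the degenerate symbols and the encoded amino acids can be checked mechanically. The sketch below is offered as an illustration under the standard genetic code. It expands CGX, AZZ and GAXYCX into concrete sequences and translates them; the helper names (expand, translate) are hypothetical and are not taken from the specification.

```python
from itertools import product

# Standard genetic code, built in the conventional TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: aa
               for (a, b, c), aa in zip(product(BASES, repeat=3), AMINO)}

# Degenerate symbols as defined in the text.
SYMBOLS = {"X": "ACGT", "Y": "GC", "Z": "AG"}


def expand(pattern: str) -> list[str]:
    """Expand a degenerate pattern such as 'GAXYCX' into concrete sequences."""
    choices = [SYMBOLS.get(ch, ch) for ch in pattern]
    return ["".join(p) for p in product(*choices)]


def translate(dna: str) -> str:
    """Translate an in-frame DNA string into one-letter amino acid codes."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Residues encoded by R (CGX or AZZ) and dipeptides encoded by GAXYCX.
r_residues = {translate(c) for pat in ("CGX", "AZZ") for c in expand(pat)}
dipeptides = {translate(s) for s in expand("GAXYCX")}

if __name__ == "__main__":
    print(sorted(r_residues))   # K = lysine, R = arginine
    print(sorted(dipeptides))   # D/E = asp/glu, A/P = ala/pro
```

Every expansion of R translates to lysine or arginine, and every expansion of GAXYCX translates to an acidic residue followed by alanine or proline, in agreement with the definitions given above.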

In providing for useful DNA sequences which
can be used for cassettes for expression, the following
sequence can be conveniently employed:

Tr-L-((R-S)_r"-(GAXYCX)_n'-W-(Gene*)_d)_y

wherein:

Tr intends a DNA sequence encoding for the
transcriptional regulatory signals, particularly the
promoter and such other regulatory signals as operators,
activators, cap signal, signals enhancing ribosomal
binding, or other sequence involved with transcriptional
or translational control. The Tr sequence will generally
be at least about 100 bp and not more than about 2000 bp.
Particularly useful is employing the Tr sequence
associated with the leader sequence L, so that a DNA
fragment can be employed which includes the transcrip-
tional and translational signal sequences associated
with the leader sequence endogenous to the host.
Alternatively, one may employ other transcriptional and
translational signals to provide for enhanced production
of the expression product;

d is 0 or 1, being 1 when y is greater than
1;

n' is a whole number, generally ranging from
0 to 3, more usually being 0 or 2 to 3;

r" is 1 or 2;

W intends a terminal deoxyribosyl-3' group,
or a DNA sequence which by itself or, when n' is other
than 0, in combination with the nucleotides to which it
is joined, defines a restriction site, having either
a cohesive end or butt end, wherein W may have from 0
to about 20 nucleotides in the longest chain;

the remaining symbols having been defined
previously.

Of particular interest is the following
construct:

(Tr)_a-L-(R-S)_r"-(GAXYCX)_n"-GA[AGCT]

wherein:

all of the symbols previously defined have
the same definition;

a is 0 or 1, intending that the construct may
or may not have the transcriptional and translational
signals;

the nucleotides indicated in the broken box
(shown here in square brackets) are intended not to be
present but to be capable of addition by adding a
HindIII cleaved terminus to provide for the recreation
of the sequence encoding for a dipeptide; and

n" will be 0 to 2, where at least one of the
Xs and Ys defines a nucleotide, so that the sequence in
the parenthesis is other than the sequence GAAGCT.

The coding sequence of Gene* may be joined to
the terminal T, providing that the coding sequence is
in frame with the initiation codon and upon processing
the first amino acid will be the correct amino acid for
the mature polypeptide.

The 3'-terminus of Gene* can be manipulated
much more easily and, therefore, it is desirable to
provide a construct which allows for insertion of Gene*
into a unique restriction site in the construct. Such
a construct would provide for a restriction site with
insertion of the Gene* into the restriction site to be
in frame with the initiation codon. Such a construction
can be symbolized as follows:

(Tr)_a-L-(R-S)_r"-(GAXYCX)_n"-W-(SC)_b-Te

wherein:

those symbols previously defined have the
same definition;

SC are stop codons;

Te is a termination sequence balanced with
the promoter Tr, and may include other signals, e.g.
polyadenylation; and

b is an integer which will generally vary
from about 0 to 4, more usually from 0 to 3, it being
understood that Gene* may include its own stop codons.

Illustrative of a sequence having the above
formula is where W is the sequence GA and n" is 2.

Of particular interest is where the sequence
encoding the terminal dipeptide is taken together with
W to define a linker or connector, which allows for
recreation of the terminal sequence defining the
dipeptide of the processing signal and encodes for the
initial amino acids of Gene*, so that the codons are in
frame with the initiation codon of the leader. The
linker provides for a staggered or butt ended termina-
tion, desirably defining a restriction site in conjunc-
tion with the successive sequences of the Gene*. Upon
ligation of the linker with Gene*, the codons of Gene*
will be in frame with the initiation codon of the
leader. In this manner, one can employ a synthetic
sequence which may be joined to a restriction site in
the processing signal sequence to recreate the proces-
sing signal, while providing the initial bases of the
Gene* encoding for the N-terminal amino acids. By
employing a synthetic sequence, the synthetic linker
can be a tailored connector having a convenient restric-
tion site near the 3'-terminus and the synthetic
connector will then provide for the necessary codons
for the 5'-terminus of the gene.

Alternatively, one could introduce a restric-
tion endonuclease recognition site downstream from the
processing signal to allow for cleavage, and removal of
superfluous bases to provide for ligation of the Gene*
to the processing signal in frame with the initiation
codon. Thus the first codon would encode for the
N-terminal amino acid of the polypeptide. Where T is
the first base of Gene*, one could introduce a restric-
tion site where the recognition sequence is downstream
from the cleavage site. For example, a Sau3A recogni-
tion sequence could be introduced immediately after the
processing signal, which would allow for cleavage and
linking of the Gene* with its initial codon in frame
with the leader initiation codon. With restriction
endonucleases which have the recognition sequence
distal and downstream from the cleavage site, e.g. HgaI,
W could define such sequence which could include a
portion of the processing signal sequences. Other
constructions can also be employed, employing such
techniques as primer repair and in vitro mutagenesis to
provide for the convenient insertion of Gene* into the
construct by introducing an appropriate restriction
site.

The construct provides a portable sequence
for insertion into vectors, which provide the desired
replication system. As already indicated, in some
instances, it may be desirable to replace the wild type
promoter associated with the leader sequence with a
different promoter. In yeast, promoters involved with
enzymes in the glycolytic pathway can provide for high
rates of transcription. These promoters are associated
with such enzymes as phosphoglucoisomerase, phos-
phofructokinase, phosphotriose isomerase, phospho-
glucomutase, enolase, pyruvic kinase, glyceraldehyde-3-
phosphate dehydrogenase, and alcohol dehydrogenase.
These promoters may be inserted upstream from the
leader sequence. The 5'-flanking region to the leader
sequence may be retained or replaced with the 3'-
sequence of the alternative promoter. Vectors can be
prepared and have been reported which include promoters
having convenient restriction sites downstream from the
promoter for insertion of such constructs as described
above.

The final construct will be an episomal
element capable of stable maintenance in a host,
particularly a fungal host such as yeast. The construct
will include one or more replication systems, desirably
two replication systems, allowing for maintenance in
the expression host and cloning in a prokaryote. In
addition, one or more markers for selection will be
included, which will allow for selective pressure for
maintenance of the episomal element in the host.
Furthermore, the episomal element may be a high or low
copy number, the copy number generally ranging from
about 1 to 200. With high copy number episomal elements,
there will generally be at least 10, preferably at
least 20, and usually not exceeding about 150, more
usually not exceeding about 100 copy number. Depending
upon the Gene*, either high or low copy numbers may be
desirable, depending upon the effect of the episomal
element on the host. Where the presence of the expres-
sion product of the episomal element may have a dele-
terious effect on the viability of the host, a low copy
number may be indicated.

Various hosts may be employed, particularly
mutants having desired properties. It should be
appreciated that depending upon the rate of production
of the expression product of the construct, the pro-
cessing enzyme may or may not be adequate for process-
ing at that level of production. Therefore, a mutant
having enhanced production of the processing enzyme may
be indicated or enhanced production of the enzyme may
be provided by means of an episomal element. Generally,
the production of the enzyme should be of a lower order
than the production of the desired expression product.

Where one is using α-factor for secretion and
processing, it would be appropriate to provide for
enhanced production of the processing enzyme Dipeptidyl
Amino Peptidase A, which appears to be the expression
product of STE13. This enzyme appears to be specific
for X-Ala- and X-Pro-sequences, where X in this instance
intends an amino acid, particularly the dicarboxylic
acid amino acids.

Alternatively, there may be situations where
intracellular processing is not desired. In this
situation, it would be useful to have a ste13 mutant,
where secretion occurs, but the product is not pro-
cessed. In this manner, the product may be subse-
quently processed in vitro.

Host mutants which provide for controlled 
regulation of expression may be employed to advantage. 
For example, with the constructions of the subject 
invention where a fused protein is expressed, the
transformants have slow growth which appears to be a
result of toxicity of the fused protein. Thus, by 
inhibiting expression during growth, the host may be 
grown to high density before changing the conditions to 
permissive conditions for expression. 

A temperature-sensitive sir mutant may be 
employed to achieve regulated expression. Mutation in 
any of the SIR genes results in a non-mating phenotype 
due to in situ expression of the normally silent MATa
and MATα sequences present at the HML and HMR loci.

Furthermore, as already indicated, the Gene* 
may have a plurality of sequences in tandem, either the 
same or different sequences, with intervening processing 
signals. In this manner, the product may be processed 
in whole or in part, with the result that one will 
obtain the various sequences either by themselves or in 
tandem for subsequent processing. In many situations, 
it may be desirable to provide for different sequences, 
where each of the sequences is a subunit of a particular 
protein product. 

The Gene* may encode for any type of polypep- 
tide of interest. The polypeptide may be as small as 
an oligopeptide of 8 amino acids or may be 100,000 
daltons or higher. Usually, single chains will be less 
than about 300,000 daltons, more usually less than 
about 150,000 daltons. Of particular interest are 
polypeptides of from about 5,000 to 150,000 daltons, 
more particularly of about 5,000 to 100,000 daltons.
Illustrative polypeptides of interest include hormones
and factors, such as growth hormone, somatomedins,
epidermal growth factor, the endocrine secretions, such
as luteinizing hormone, thyroid stimulating hormone,
oxytocin, insulin, vasopressin, renin, calcitonin,
follicle stimulating hormone, prolactin, etc.; hemato-
poietic factors, e.g. erythropoietin, colony stimulating
factor, etc.; lymphokines; globins; globulins, e.g.
immunoglobulins; albumins; interferons, such as α, β
and γ; repressors; enzymes; endorphins, e.g. β-endorphin,
enkephalin, dynorphin, etc.

Having prepared the episomal elements con-
taining the constructs of this invention, one may then
introduce such element into an appropriate host. The
manner of introduction is conventional, there being a
wide variety of ways to introduce DNA into a host.
Conveniently, spheroplasts are prepared employing the
procedure of, for example, Hinnen et al., PNAS USA
(1978) 75:1919-1933 or Stinchcomb et al., EP 0 045 573
A2. The transformants may then be grown in an appro-
priate nutrient medium, where appropriate maintaining
selective pressure on the transformants. Where expres-
sion is inducible, one can allow for growth of the
yeast to high density and then induce expression. In
those situations where a substantial proportion of the
product may be retained in the periplasmic space, one
can release the product by treating the yeast cells
with an enzyme such as zymolase or lyticase.

The product may be harvested by any conve-
nient means, purifying the protein by chromatography,
electrophoresis, dialysis, solvent-solvent extraction,
etc.

In accordance with the subject invention, one
can provide for secretion of a wide variety of polypep-
tides, so as to greatly enhance product yield, simplify
purification, minimize degradation of the desired
product, and simplify processing, equipment, and
engineering requirements. Furthermore, utilization of
nutrients based on productivity can be greatly enhanced,
so that more economical and more efficient production
of polypeptides may be achieved. Also, the use of
yeast has many advantages in avoiding enterotoxins,
which may be present with prokaryotes, and employing
known techniques, which have been developed for yeast
over long periods of time, which techniques include
isolation of yeast products.

The following examples are offered by way of 
illustration and not by way of limitation. 

EXPERIMENTAL 

A synthetic sequence for human epidermal
growth factor (EGF) based on the amino acid sequence of
EGF reported by H. Gregory and B.M. Preston, Int. J.
Peptide Protein Res. 9, 107-118 (1977) was prepared,
which had the following sequence:

5' AACTCCGACTCCGAATGTCCATTGTCCCACGACGGTTACTGTTTGCACGACGGTGTTTGT
3' TTGAGGCTGAGGCTTACAGGTAACAGGGTGCTGCCAATGACAAA...

   ATGTACATCGAAGCTTTGGACAAGTACGCTTGTAACTGTGTTGTTGGTTACATCGGTGAA
   TACATGTAGCTTCGAAACCTGTTCATGCGAACATTGACAC...

   AGATGTCAATACAGAGACTTGAAGTGGTGGGAATTGAGATGA
   TCTACAGTTATGTCTCTGAACTTCACCACCCTTAACTCTACT

where 5' indicates the promoter proximal end of the
sequence. The sequence was inserted into the EcoRI
site of pBR328 to produce a plasmid p328EGF-1 and
cloned.
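
As an illustrative check, the coding strand of the synthetic gene can be translated under the standard genetic code and compared with the intended EGF amino acid sequence. In the sketch below the top strand is a transcription of the above listing and should be treated as an assumption to be verified against the original document; the names EGF_TOP, translate and complement are hypothetical.

```python
from itertools import product

# Standard genetic code, built in the conventional TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: aa
               for (a, b, c), aa in zip(product(BASES, repeat=3), AMINO)}

# Top (5'->3') strand as transcribed from the listing; an assumption.
EGF_TOP = (
    "AACTCCGACTCCGAATGTCCATTGTCCCACGACGGTTACTGTTTGCACGACGGTGTTTGT"
    "ATGTACATCGAAGCTTTGGACAAGTACGCTTGTAACTGTGTTGTTGGTTACATCGGTGAA"
    "AGATGTCAATACAGAGACTTGAAGTGGTGGGAATTGAGATGA"
)


def translate(dna: str) -> str:
    """Translate from the first base; '*' marks a stop codon."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def complement(dna: str) -> str:
    """Base-by-base complement, written in the same left-to-right order."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))


protein = translate(EGF_TOP)

if __name__ == "__main__":
    print(len(EGF_TOP), "bases")
    print(protein)
    # The last block of the lower strand in the listing should pair with
    # the last 42 bases of the top strand.
    print(complement(EGF_TOP[120:]))
```

Comparing the complement of each block with the lower strand of the listing is a convenient way to detect transcription errors in either strand.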

Approximately 30 µg of p328EGF-1 was digested
with EcoRI and approximately 1 µg of the expected 190
base pair EcoRI fragment was isolated. This was
followed by digestion with the restriction enzyme HgaI.
Two synthetic oligonucleotide connectors HindIII-HgaI
and HgaI-SalI were then ligated to the 159 base pair
HgaI fragment. The HgaI-HindIII linker had the following
sequence:

5' AGCTGAAGCT      3'
3'     CTTCGATTGAG 5'

This linker restores the α-factor processing signals
interrupted by the HindIII digestion and joins the HgaI
end at the 5'-end of the EGF gene to the HindIII end
of pAB112.

The HgaI-SalI linker had the following
sequence:

5' TGAGATGATAAG     3'
3'      ACTATTCAGCT 5'

This linker has two stop codons and joins the HgaI end
at the 3'-end of the EGF gene to the SalI end of
pAB112.
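
The pairing of the two linkers can be verified with a short script. This is a sketch under stated assumptions: the upper strand of each linker is taken as written 5' to 3' and the lower strand 3' to 5', the stagger (4 and 5 unpaired bases) is chosen so that the strands pair, and the reading frame used to locate the stop codons is inferred from the end of the EGF coding sequence (... AGA TGA). The function name duplex is hypothetical.

```python
def complement(dna: str) -> str:
    """Base-by-base complement (no reversal)."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))


def duplex(top: str, bottom_3to5: str, offset: int) -> tuple[str, str]:
    """Check pairing of a staggered linker.

    `top` is written 5'->3', `bottom_3to5` is written 3'->5', and `offset`
    is the number of unpaired bases at the 5' end of `top`.
    Returns (top overhang, bottom overhang written 3'->5').
    """
    paired = top[offset:]
    if complement(paired) != bottom_3to5[:len(paired)]:
        raise ValueError("strands are not complementary at this offset")
    return top[:offset], bottom_3to5[len(paired):]


HGA_SAL_TOP = "TGAGATGATAAG"

hind_hga = duplex("AGCTGAAGCT", "CTTCGATTGAG", offset=4)
hga_sal = duplex(HGA_SAL_TOP, "ACTATTCAGCT", offset=5)

# Codons of the HgaI-SalI linker in the assumed EGF reading frame.
frame = [HGA_SAL_TOP[i:i + 3] for i in range(2, 11, 3)]
stop_codons = [c for c in frame if c in {"TAA", "TAG", "TGA"}]

if __name__ == "__main__":
    print(hind_hga)                  # HindIII-type overhang, HgaI-side overhang
    print(complement(hind_hga[1]))   # bases the HgaI-side overhang pairs with
    print(hga_sal)                   # HgaI-side overhang, SalI-type overhang
    print(stop_codons)
```

The AGCT overhang of the first linker matches a HindIII-cleaved end, and its lower-strand overhang pairs with AACTC, the first five bases of the EGF coding sequence. The lower-strand overhang of the second linker, read 5' to 3' as TCGA, matches a SalI-cleaved end.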

The resulting 181 base pair fragment was
purified by preparative gel electrophoresis and ligated
to 100 ng of pAB112 which had been previously completely
digested with the enzymes HindIII and SalI. Surprisingly,
a deletion occurred where the codons for the 3rd and 4th
amino acids of EGF, asp and ser, were deleted, with the
remainder of the EGF being retained.

pAB112 is a plasmid containing a 1.75 kb EcoRI
fragment with the yeast α-factor gene cloned in the
EcoRI site of pBR322 in which the HindIII and SalI
sites had been deleted (pAB11). pAB112 was derived from
plasmid pAB101 which contains the yeast α-factor gene
as a partial Sau3A fragment cloned in the BamHI site of
plasmid YEp24. pAB101 was obtained by screening a
yeast genomic library in YEp24 using a synthetic 20-mer
oligonucleotide probe (3'-GGCCGGTTGGTTACATGATT-5')
homologous to the published α-factor coding region
(Kurjan and Herskowitz, Abstracts 1981 Cold Spring
Harbor meeting on the Molecular Biology of Yeasts,
page 242).

The resulting mixture was used to transform
E. coli HB101 cells and plasmid pAB201 obtained.
Plasmid pAB201 (5 µg) was digested to completion with
the enzyme EcoRI and the resulting fragments were:
a) filled in with DNA polymerase I Klenow fragment;
b) ligated to an excess of BamHI linkers; and
c) digested with BamHI. The 1.75 kbp EcoRI fragment was
isolated by preparative gel electrophoresis and
approximately 100 ng of the fragment was ligated to
100 ng of pC1/1, which had been previously digested to
completion with the restriction enzyme BamHI and
treated with alkaline phosphatase.

Plasmid pC1/1 is a derivative of pJDB219,
Beggs, Nature (1978) 275:104, in which the region
corresponding to bacterial plasmid pMB9 in pJDB219 has
been replaced by pBR322 in pC1/1. This mixture was
used to transform E. coli HB101 cells. Transformants
were selected by ampicillin resistance and their
plasmids analyzed by restriction endonucleases. DNA
from one selected clone (pYEGF-8) was prepared and used
to transform yeast AB103 cells. Transformants were
selected by their leu+ phenotype.

Fifty milliliter cultures of yeast strain
AB103 (α, pep 4-3, leu 2-3, leu 2-112, ura 3-52, his
4-580) transformed with plasmid pYEGF-8 (deposited at
the American Type Culture Collection on 5th January
1983 and given ATCC Accession no. 20658) were grown at
30° in -leu medium to saturation (optical density at
600 nm of 5) and left shaking at 30° for an additional
12 hr period. Cell supernatants were collected by
centrifugation and analyzed for the presence of human
EGF using the fibroblast receptor competition binding
assay. The assay of EGF is based on the ability of
both mouse and human EGF to compete with 125I-labeled
mouse EGF for binding sites on human foreskin fibro-
blasts. Standard curves can be obtained by measuring
the effects of increasing quantities of EGF on the bind-
ing of a standard amount of 125I-labeled mouse EGF.
Under these conditions 2 to 20 ng of EGF are readily
measurable. Details on the binding of 125I-labeled
epidermal growth factor to human fibroblasts have been
described by Carpenter et al., J. Biol. Chem. 250, 4297
(1975). Using this assay it is found that the culture
medium contains 7±1 mg of human EGF per liter.

For further characterization, human EGF
present in the supernatant was purified by adsorption
to the ion-exchange resin Biorex-70 and elution with
HCl 10 mM in 80% ethanol. After evaporation of the HCl
and ethanol the EGF was solubilized in water. This
material migrates as a single major protein of MW
approx. 6,000 in 17.5% SDS gels, roughly the same as
authentic mouse EGF (MW~6,000). This indicates that
the α-factor leader sequence has been properly excised
during the secretion process. Analysis by high resolu-
tion liquid chromatography (microbondapak C18, Waters
column) indicates that the product migrates with a
retention time similar to an authentic mouse EGF
standard. However, protein sequencing by Edman degrada-
tion showed that the N-terminus retained the glu-ala
sequence.

A number of other constructions were prepared
using different constructions for joining hEGF to the
α-factor secretory leader sequence, providing for
different processing signals and site mutagenesis. In
Fig. 2, a. through e. show the sequence of the fusions at
the N-terminal region of hEGF, which sequences differ
among several constructions; f. shows the sequences at
the C-terminal region of hEGF, which is the same for all
constructions. Synthetic oligonucleotide linkers used
in these constructions are boxed.

These fusions were made as follows. Construc-
tion (a) was made as described above. Construction (b)
was made in a similar way except that linker 2 was used
instead of linker 1. Linker 2 modifies the α-factor
processing signal by inserting an additional processing
site (ser-leu-asp-lys-arg) immediately preceding the
hEGF gene. The resulting yeast plasmid is named
pYαEGF-22. Construction (c), in which the dipeptidyl
aminopeptidase maturation site (glu-ala) has been removed,
was obtained by in vitro mutagenesis of construction
(a). A PstI-SalI fragment containing the α-factor
leader-hEGF fusion was cloned in phage M13 and isolated
in a single-stranded form. A synthetic 31-mer of
sequence 5'-TCTTTGGATAAAAGAAACTCCGACTCCCG-3' was
synthesized and 70 picomoles were used as a primer for
the synthesis of the second strand from 1 picomole of
the above template by the Klenow fragment of DNA
polymerase. After fill-in and ligation at 14° for 18
hrs, the mixture was treated with nuclease (5 units
for 15 min) and used to transfect E. coli JM101 cells.
Bacteriophage containing DNA sequences in which the
region coding for (glu-ala) was removed were located by
filter plaque hybridization using the 32P-labeled
primer as probe. RF DNA from positive plaques was
isolated, digested with PstI and SalI and the resulting
fragment inserted in pAB114 which had been previously
digested to completion with SalI and partially with
PstI and treated with alkaline phosphatase.

The plasmid pAB114 was derived as follows:
plasmid pAB112 was digested to completion with HindIII
and then religated at low (4 µg/ml) DNA concentration
and plasmid pAB113 was obtained in which three 63 bp
HindIII fragments have been deleted from the α-factor
structural gene, leaving only a single copy of mature
α-factor coding region. A BamHI site was added to
plasmid pAB11 by cleavage with EcoRI, filling in of the
overhanging ends by the Klenow fragment of DNA
polymerase, ligation of BamHI linkers, cleavage with
BamHI and religation to obtain pAB12. Plasmid pAB113
was digested with EcoRI, the overhanging ends filled
in, and ligated to BamHI linkers. After digestion with
BamHI the 1500 bp fragment was gel-purified and ligated
to pAB12 which had been digested with BamHI and treated
with alkaline phosphatase. Plasmid pAB114, which
contains a 1500 bp BamHI fragment carrying the α-factor
gene, was obtained. The resulting plasmid (pAB114
containing the above described construct) is then
digested with BamHI and ligated into plasmid pC1/1.
The resulting yeast plasmid is named pYαEGF-23
and was deposited at the American Type Culture Collection
on 12th August 1983 under ATCC Accession no. 40079.
Construction (d), in which a new KpnI site was generated, was
made as described for construction (c) except that the 36-mer
oligonucleotide primer of sequence
5'-GGGTACCTTTGGATAAAAGAAACTCCGACTCCGAAT-3' was used. The
resulting yeast plasmid is named pYαEGF-24. Construction (e)
was derived by digestion of the plasmid containing
construction (d) with KpnI and SalI instead of linker 1 and 2.
The resulting yeast plasmid is named pYαEGF-25.
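
The properties of the 36-mer primer for construction (d) can be checked with a short sketch. It is illustrative only and is not part of the original disclosure. The reading frame (offset 2) is an assumption, chosen because it places the lys-arg processing site in frame; the residues that follow (asn-ser-asp-ser-glu) are taken here to be the N-terminus of hEGF. The names translate, PRIMER_D and KPNI_SITE are hypothetical helpers.

```python
# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna, frame=0):
    """Translate dna from a 0-based frame offset, dropping any trailing partial codon."""
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE[c] for c in codons)


PRIMER_D = "GGGTACCTTTGGATAAAAGAAACTCCGACTCCGAAT"  # 36-mer quoted in the text
KPNI_SITE = "GGTACC"                               # KpnI recognition sequence

primer_length = len(PRIMER_D)
kpni_position = PRIMER_D.find(KPNI_SITE)  # 0-based index of the KpnI site
# Assumption: offset 2 is the frame that reads lys-arg (KR) in frame.
peptide = translate(PRIMER_D, frame=2)

print(primer_length, kpni_position, peptide)
```

Under these assumptions the primer is 36 nucleotides long, carries the KpnI site one base from its 5' end, and reads through leu-asp-lys-arg directly into asn-ser-asp-ser-glu, with no glu-ala dipeptide between the lys-arg site and the following residues.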

Yeast cells transformed with pYαEGF-22 were grown in 15 ml
cultures. At the indicated densities or times, cultures were
centrifuged and the supernatants saved and kept on ice. The
cell pellets were washed in lysis buffer (0.1 Triton X-100,
10mM NaHPO4 pH 7.5) and broken by vortexing (5min in 1min
intervals with cooling on ice in between) in one volume of
lysis buffer and one volume of glass beads. After
centrifugation, the supernatants were collected and kept on
ice. The amount of hEGF in the culture medium and cell
extracts was measured using the fibroblast receptor binding
competition assay. Standard curves were obtained by measuring
the effects of increasing quantities of mouse EGF on the
binding of a standard amount of 125I-labeled mouse EGF.

Proteins were concentrated from the culture media by
absorption on Bio-Rex 70 resin and elution with 0.01 HCl in
80% ethanol and purified by high performance liquid
chromatography (HPLC) on a reverse phase C18 column. The
column was eluted at a flow rate of 4ml/min with a linear
gradient of 5% to 80% acetonitrile containing 0.2%
trifluoroacetic acid in 60min. Proteins (200-800 picomoles)
were sequenced at the amino-terminal end by the Edman
degradation method using a gas-phase protein sequencer
(Applied Biosystems model 470A). The normal PROTFA program
was used for all the analyses. Dithiothreitol was added to S2
(ethyl acetate: 20mg/liter) and S3 (butyl chloride:
10mg/liter) immediately before use. All samples were treated
with 1N HCl in methanol at 40° for 15min to convert
PTH-aspartic acid and PTH-glutamic acid to their methyl
esters. All PTH-amino acid identifications were performed by
reference to retention times on an IBM CN HPLC column using a
known mixture of PTH-amino acids as standards.
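
The elution conditions stated above (linear gradient of 5% to 80% acetonitrile over 60min at 4ml/min) imply a fixed gradient slope and total eluent volume. The following sketch works through that arithmetic; the function name acetonitrile_percent and its parameters are hypothetical and serve only as illustration.

```python
def acetonitrile_percent(t_min, start=5.0, end=80.0, duration_min=60.0):
    """Acetonitrile percentage at time t_min for a linear gradient.

    Times outside the gradient window are clamped to its start or end.
    """
    t = min(max(t_min, 0.0), duration_min)
    return start + (end - start) * t / duration_min


FLOW_ML_PER_MIN = 4.0  # flow rate stated in the text

slope_percent_per_min = (80.0 - 5.0) / 60.0    # change in acetonitrile per minute
total_volume_ml = FLOW_ML_PER_MIN * 60.0       # eluent volume over the whole gradient
midpoint_percent = acetonitrile_percent(30.0)  # composition halfway through the run

print(slope_percent_per_min, total_volume_ml, midpoint_percent)
```

The gradient therefore rises by 1.25% acetonitrile per minute, consumes 240ml of eluent, and reaches 42.5% acetonitrile at 30min.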

Secretion from pYαEGF-22 gave a 4:1 mole ratio of native
N-terminus hEGF to glu-ala terminated hEGF, while secretion
from pYαEGF-23 to -25 gave only native N-terminated hEGF.
Yields of hEGF ranged from 5 to 8 µg/ml measured either as
protein or in a receptor binding assay.

The strain JRY188 (MAT sir3-8 leu2-3 leu2-112 trp1 ura3 his4
rme) was transformed with pYαEGF-21 and leucine prototrophs
selected at 37°. Saturated cultures were then diluted 1/100
in fresh medium and grown in leucine selective medium at
permissive (24°) and non-permissive (36°) temperatures and
culture supernatants were assayed for the presence of hEGF as
described above. The results are shown in the following
table.
Regulated synthesis and secretion of hEGF in transformed
yeast sir3 temperature-sensitive mutants.

Temperature   Transformant   O.D.650   hEGF (ng/ml)
36°           3a             3.5       0.010
                             5.4       0.026
                             [illegible]
                             6.4       0.024
24°           3a             0.4       34
                             1.3       145
                             2.1       1075
                             4.0       3250
              3b             0.4       32
                             1.4       210
                             2.2       1935
                             4.2       4600

These results indicate that the hybrid α-factor/EGF gene is
being expressed under mating type regulation, even though it
is present on a high copy number plasmid.
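
The size of the temperature effect can be read from the table with a short calculation. The sketch below is illustrative only: it copies the legible table entries, omits the one illegible row, and compares the highest hEGF reading at each temperature. The names TABLE and peak_hegf are hypothetical.

```python
# Legible hEGF readings from the table, keyed by (temperature, transformant).
# Each entry is (O.D.650, hEGF) in the table's own concentration unit.
# The single illegible row at 36 degrees is omitted.
TABLE = {
    ("36", "3a"): [(3.5, 0.010), (5.4, 0.026), (6.4, 0.024)],
    ("24", "3a"): [(0.4, 34), (1.3, 145), (2.1, 1075), (4.0, 3250)],
    ("24", "3b"): [(0.4, 32), (1.4, 210), (2.2, 1935), (4.2, 4600)],
}


def peak_hegf(rows):
    """Highest hEGF reading among (O.D., hEGF) rows."""
    return max(value for _, value in rows)


peak_36 = peak_hegf(TABLE[("36", "3a")])
peak_24 = max(peak_hegf(TABLE[("24", "3a")]), peak_hegf(TABLE[("24", "3b")]))
fold_difference = peak_24 / peak_36

print(f"peak at 24: {peak_24}, peak at 36: {peak_36}, ratio: {fold_difference:.0f}")
```

On these figures the peak reading at 24° exceeds the peak reading at 36° by a factor of more than 100,000, which is consistent with the conclusion that expression is under mating type regulation.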




In accordance with the subject invention, novel constructs
are provided which may be inserted into vectors to provide
for expression of polypeptides having an N-terminal leader
sequence and one or more processing signals to provide for
secretion of the polypeptide as well as processing to result
in a mature polypeptide product free of superfluous amino
acids. Thus, one can obtain a polypeptide having the
identical sequence to a naturally occurring polypeptide. In
addition, because the polypeptide can be produced in yeast,
glycosylation can occur, so that products can be obtained
which are identical to the naturally occurring products.
Furthermore, because the product is secreted, greatly
enhanced yields can be obtained based on cell population and
processing and purification are greatly simplified. In
addition, employing mutant hosts, expression can be regulated
to be turned off or on, as desired.

Although the foregoing invention has been described in some
detail by way of illustration and example for purposes of
clarity of understanding, it will be obvious that certain
changes and modifications may be practiced within the scope
of the appended claims.
CLAIMS

1. A DNA construct encoding a pre-pro-polypeptide, said DNA
construct encoding pre-pro-polypeptide comprising a yeast
leader sequence, processing signals for processing the
pre-pro-polypeptide to a mature polypeptide and a gene
encoding a polypeptide other than the wild type gene
associated with said leader sequence.

2. A DNA construct according to Claim 1, including at the 5'
end of the sequence a yeast promoter and wherein said gene is
heterologous to said yeast host.

3. A DNA construct according to Claim 2, wherein said yeast
promoter is the α-factor promoter and said yeast leader is a
leader sequence encoding for at least a major portion of the
α-factor leader and is capable of providing for secretion.

4. A DNA construct according to Claim 2, wherein said gene is
a mammalian gene.

5. A DNA construct comprising a sequence of the following
formula:

L-((R)r-(GAXYCX)n-Gene*)y

wherein:

L is a leader sequence recognized by yeast for secretion;

R is a codon coding for arginine or lysine;

r is an integer of from 2 to 4;

X is any nucleotide;

Y is guanosine or cytosine;

y is an integer of from about 1 to 10;

Gene* is a gene foreign to yeast; and

n is 0 or 1 to 4.

6. A DNA construct according to Claim 5, wherein n is 0 to 4
and the nucleotides of said Gene* proximal to R at least in
part define a recognition site for a restriction
endonuclease.

7. A DNA construct according to Claim 6, wherein said leader
sequence is the α-factor leader sequence.

8. A DNA construct according to Claim 7, wherein n is 0.

9. A DNA construct of the formula:

Tr-L-(R-S-(GAXYCX)n'-W-(Gene*)d)y

wherein:

Tr is a sequence having transcriptional and translational
regulatory signals for initiation and processing of
transcription and translation, wherein said regulatory
signals are recognized by yeast;

L is a leader sequence for secretion by yeast;

R and S are codons expressing arginine and lysine;

X is any nucleotide;

Y is cytosine or guanosine;

y is an integer of from 1 to 4;

n' is a whole number of from 0 to 4;

W is a deoxyribosyl-3' group or, when n' is other than 0, one
or more nucleotides which by themselves or together with the
hexanucleotide in the parenthesis define a restriction site;

Gene* either by itself or taken together with W defines a
polypeptide sequence foreign to yeast; and

d is 0 or 1, being 1, when y is greater than 1.

10. A DNA construct according to Claim 9, wherein Tr is a
sequence defining the regulatory signals for α-factor, d is 1
and Gene* and W are taken together to define a polypeptide
foreign to yeast.

11. A DNA construct according to Claim 9 wherein n' is 0.

12. A DNA construct according to Claim 11, wherein said
polypeptide product is a mammalian polypeptide.



13. A DNA construct comprising a sequence of the formula:

(Tr)a-L-R-S-(GAXYCX)n"-GA¦AGCT¦

wherein:

Tr is a sequence defining transcriptional and translational
regulatory signals for initiation and processing of
transcription and translation recognized by yeast;

a is 0 or 1;

L is a leader sequence recognized by yeast;

R and S are codons encoding for lysine and arginine;

X is any nucleotide;

Y is cytosine or guanosine;

n" is 2 to 4;

the nucleotides in the broken box indicate the nucleotides
which are complementary to the overhang of the non-coding
chain to define a HindIII restriction site.

14. A DNA construct according to Claim 13, where Tr is a
sequence defining the transcriptional and translational
regulatory signals of α-factor.

15. An expression episomal element comprising a replication
system for providing stable maintenance in yeast and a
sequence of the formula:

Tr-L-(R)r'-(GAXYCX)n'-W-Te

wherein:

Tr is a sequence defining transcriptional and translational
regulatory signals for initiation and processing of
transcription and translation in yeast;

L is a leader sequence recognized by yeast for secretion;

R is a codon defining arginine or lysine;

r' is a whole number in the range of 2 to 4;

X is any nucleotide;

Y is cytosine or guanosine;

n' is a whole number in the range of 0 to 4;

W is a nucleotide sequence of at least 1 nucleotide, which by
itself or, when n' is other than 0, in conjunction with
nucleotides in the parenthesis defines a restriction site;
and

Te is a sequence defining a terminator balanced with said
transcriptional initiator sequence.

16. An expression episomal element according to Claim 15
wherein Tr is derived from α-factor and n' is 2 to 3.

17. An expression episomal element according to Claim 14,
wherein Tr is derived from α-factor and n' is 0.



18. An episomal expression vector according to Claim 17,
having a gene foreign to yeast intermediate R and Te and in
reading frame with the initiation codon of L.

19. An episomal expression element according to Claim 18,
wherein said gene is a mammalian gene.

20. An episomal element according to Claim 19, wherein said
mammalian gene is human epidermal growth factor.

21. An episomal expression vector according to Claim 16,
having a gene foreign to yeast intermediate the nucleotides
in the parentheses and Te and in reading frame with the
initiation codon of L.

22. A method for producing a polypeptide foreign to yeast and
having such polypeptide secreted into the culture medium,
said method comprising:

growing yeast containing an episomal expression element
according to Claim 16, whereby the encoding sequences are
expressed to produce a pre-pro-polypeptide; and

said pre-pro-polypeptide is at least partially processed and
secreted.

23. An episomal expression vector according to Claim 17,
having a gene foreign to yeast intermediate the nucleotides
in the parentheses and Te and in reading frame with the
initiation codon of L.

24. A method for producing a polypeptide foreign to yeast and
having such polypeptide secreted into the culture medium,
said method comprising:

growing yeast mutants containing an episomal expression
element according to Claim 16, wherein said mutant permits
external regulation of expression, whereby the encoding
sequences are expressed to produce a pre-pro-polypeptide
under permissive conditions; and

said pre-pro-polypeptide is at least partially processed and
secreted.

25. A method according to Claim 24, wherein said mutant yeast
is a temperature-sensitive sir mutant.

26. A method for producing a polypeptide foreign to yeast and
having such polypeptide secreted into the culture medium,
said method comprising:

growing yeast mutants containing an episomal expression
element according to Claim 17, wherein said mutant permits
external regulation of expression, whereby the encoding
sequences are expressed to produce a pre-pro-polypeptide
under permissive conditions; and

said pre-pro-polypeptide is at least partially processed and
secreted.

27. A method according to Claim 26, wherein said mutant yeast
is a temperature-sensitive sir mutant.
Description

BACKGROUND OF THE INVENTION

Field of the Invention

Hybrid DNA technology has revolutionized the ability to produce polypeptides of an infinite variety of
compositions. Since living forms are composed of proteins and employ proteins for regulation, the ability to
duplicate these proteins at will offers unique opportunities for investigating the manner in which these
proteins function and the use of such proteins, fragments of such proteins, or analogs in therapy and
diagnosis.

There have been numerous advances in improving the rate and amount of protein produced by a cell.
Most of these advances have been associated with higher copy numbers, more efficient promoters, and
means for reducing the amount of degradation of the desired product. It is evident that it would be
extremely desirable to be able to secrete polypeptides of interest where such polypeptides are the product
of interest.

Furthermore, in many situations, the polypeptide of interest does not have an initial methionine amino
acid. This is usually a result of there being a processing signal in the gene encoding for the polypeptide of
interest, which the gene source recognizes and cleaves with an appropriate peptidase. Since in most
situations, genes of interest are heterologous to the host in which the gene is to be expressed, such
processing occurs imprecisely and in low yield in the expression host. In this case, while the protein which
is obtained will be identical to the peptide of interest for almost all of its sequence, it will differ at the
N-terminus which can deleteriously affect physiological activity.

There are, therefore, many reasons why it would be extremely advantageous to prepare DNA
sequences, which would encode for the secretion and maturing of the polypeptide product. Furthermore,
where sequences can be found for processing, which result in the removal of amino acids superfluous to
the polypeptide of interest, the opportunity exists for having a plurality of DNA sequences, either the same
or different, in tandem, which may be encoded on a single transcript.

Description of the Prior Art

U.S. Patent No. 4,336,336 describes for prokaryotes the use of a leader sequence coding for a
non-cytoplasmic protein normally transported to or beyond the cell surface, resulting in transfer of the fused
protein to the periplasmic space. U.S. Patent No. 4,338,397 describes for prokaryotes using a leader
sequence which provides for secretion with cleavage of the leader sequence from the polypeptide
sequence of interest. U.S. Patent No. 4,338,397, columns 3 and 4, provide for useful definitions, which
definitions are incorporated herein by reference.

Kurjan and Herskowitz, Cell (1982) 30:933-943 describe a putative α-factor precursor containing four
tandem copies of mature α-factor, describing the sequence and postulating a processing mechanism.
Kurjan and Herskowitz, Abstracts of Papers presented at the 1981 Cold Spring Harbor meeting on The
Molecular Biology of Yeasts, page 242, in an Abstract entitled, "A Putative α-Factor Precursor Containing
Four Tandem Repeats of Mature α-Factor," describe the sequence encoding for the α-factor and spacers
between two of such sequences. Blair et al., Abstracts of Papers, ibid, page 243, in an Abstract entitled
"Synthesis and Processing of Yeast Pheromones: Identification and Characterization of Mutants That
Produce Altered α-Factors," describe the effect of various mutants on the production of mature α-factor.

SUMMARY OF THE INVENTION 

The subject matter of the invention is defined in the claims. 

Methods and compositions are provided for producing mature polypeptides. DNA constructs are
provided which join the DNA fragments encoding for a yeast leader sequence and processing signal to
heterologous genes for secretion and maturation of the polypeptide product. The construct of the DNA
encoding for the N-terminal cleavable alpha-factor leader (or fragments or analogs thereof) and the DNA
sequence encoding for the mature polypeptide product can be joined to appropriate vectors for introduction
into yeast or other cell which recognizes the processing signals for production of the desired polypeptide.
Other capabilities may also be introduced into the construct for various purposes.

BRIEF DESCRIPTION OF THE DRAWINGS 
Fig. 1 is a flow diagram indicating the construction of pYαEGF-21.

Fig. 2 shows sequences at fusions of hEGF to the vector; a. through e. show the sequences at the
N-terminal region of hEGF, which differ among several constructions, and f. shows the C-terminal region of
hEGF.

DESCRIPTION OF THE SPECIFIC EMBODIMENTS 

In accordance with the subject invention, eukaryotic hosts, particularly yeast, are employed for the
production of mature polypeptides, where such polypeptides may be harvested from a nutrient medium.
The polypeptides are produced by employing a DNA construct encoding for an alpha-factor leader (or
fragments or analogs thereof) and processing signals joined to a polypeptide of interest, which may be a
single polypeptide or a plurality of polypeptides separated by processing signals. The resulting construct
encodes for a pre-pro-polypeptide which will contain the signals for secretion of the pre-pro-polypeptide and
processing of the polypeptide, either intracellularly or extracellularly, to the mature polypeptide.

According to a preferred embodiment of the invention, there is provided a DNA construct comprising a
sequence comprising the formula:

5'-Tr-L-Sp-Gene*-Te-3'

wherein:

Tr is a yeast promoter sequence;
L encodes at least a yeast alpha-factor leader sequence fragment that provides for secretion;
Sp is a spacer sequence encoding processing signals for processing the precursor polypeptide
encoded by L-Sp-Gene* into the polypeptide encoded by Gene*;
Gene* encodes a polypeptide foreign to yeast; and
Te is a transcription termination sequence balanced with Tr.

Furthermore, a DNA construct of the above formula is preferred wherein Sp contains or is composed of the
sequence 5'-R1-R2-3' immediately adjacent to the sequence Gene*, R1 being a codon for lysine or arginine
and R2 being a codon for arginine, and wherein Sp does not encode a processing signal for
dipeptidylaminopeptidase A.

The constructs of the subject invention will for example have at least the following formula defining a
pro-polypeptide:

((R)r-(GAXYCX)n-Gene*)y

wherein:

R is CGX or AZZ, the codons coding for lysine and arginine, each of the Rs being the same or
different;
r is an integer of from 2 to 4, usually 2 to 3, preferably 2 or 4;
X is any of the four nucleotides, T, G, C, or A;
Y is G or C;
y is an integer of at least one and usually not more than 10, more usually not more than four, providing
for monomers and multimers;
Z is A or G;
Gene* is a gene other than α-factor, usually foreign to a yeast host, usually a heterologous gene,
desirably a plant or mammalian gene; and
n is 0 or an integer which will generally vary from 1 to 4, usually 2 to 3.

The pro-polypeptide has an N-terminal processing signal for peptidase removal of the amino acids
preceding the amino acids encoded for by Gene*.
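
The degenerate codon notation used for R can be expanded mechanically. The sketch below is illustrative only and is not part of the original disclosure: it enumerates every codon matching CGX or AZZ and translates each with the standard genetic code, to confirm that R always encodes lysine or arginine. The helper names expand and SYMBOLS are hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): AMINO[i] for i, c in enumerate(product(BASES, repeat=3))}

# Degenerate symbols as defined in the text: X is any nucleotide, Z is A or G.
SYMBOLS = {"X": "TCAG", "Z": "AG"}


def expand(pattern):
    """List every concrete sequence matching a pattern such as 'CGX' or 'AZZ'."""
    choices = [SYMBOLS.get(ch, ch) for ch in pattern]
    return ["".join(p) for p in product(*choices)]


r_codons = expand("CGX") + expand("AZZ")
r_amino_acids = {CODON_TABLE[c] for c in r_codons}

print(sorted(r_codons))
print(sorted(r_amino_acids))  # one-letter codes: K is lysine, R is arginine
```

The eight codons obtained (CGT, CGC, CGA, CGG, AAA, AAG, AGA, AGG) translate only to lysine and arginine, in agreement with the definition of R.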

For the most part, the constructs of the subject invention will have at least the following formula:

L-(R-S-(GAXYCX)n-Gene*)y

defining a pre-pro-polypeptide, wherein all the symbols except L and S have been defined, S having
the same definition as R, there being 1R and 1S, and L is an alpha-factor leader sequence (or fragment or
analog thereof) providing for secretion of the pre-pro-polypeptide. While it is feasible to have more Rs and
Ss there will usually be no advantage in the additional amino acids. Any alpha-factor leader sequence (or
fragment or analog thereof) may be employed which provides for secretion, leader sequences generally
being of about 30 to 120 amino acids, usually about 30 to 100 amino acids, having a hydrophobic region
and having a methionine at its N-terminus.

The construct when n is 0 will have the following formula:

L-((R)r'-Gene*)y

defining a pre-pro-polypeptide, wherein all the symbols have been defined previously, except r', wherein:
r' is 2 to 4, preferably 2 or 4.

Of particular interest is the leader sequence of α-factor which is described in Kurjan and Herskowitz,
supra, on page 937 or fragments or analogs thereof, which provide for efficient secretion of the desired
polypeptides. Furthermore, the DNA sequence indicated in the article, which sequence is incorporated
herein by reference, is not essential, any sequence which encodes for the desired oligopeptide being
sufficient. Different sequences will be more or less efficiently translated.

While the above formulas are preferred, it should be understood that, with suppressor mutants, other
sequences could be provided which would result in the desired function. Normally, suppressor mutants are
not as efficient for expression and, therefore, the above indicated sequence or equivalent sequence
encoding for the same amino acid sequence is preferred. To the extent that a mutant will express from a
different codon the same amino acids which are expressed by the above sequence, then such alternative
sequence could be permitted.

The dipeptides which are encoded for by the sequence in the parenthesis will be an acidic amino acid,
aspartic or glutamic, preferably glutamic, followed by a neutral amino acid, alanine or proline, particularly
alanine.
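
The statement about the dipeptides can be verified by expanding the hexanucleotide GAXYCX and translating its two codons. The following sketch is illustrative only; the helper names expand and SYMBOLS are hypothetical, and the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): AMINO[i] for i, c in enumerate(product(BASES, repeat=3))}

# Degenerate symbols as defined in the text: X is any nucleotide, Y is G or C.
SYMBOLS = {"X": "TCAG", "Y": "GC"}


def expand(pattern):
    """List every concrete sequence matching a degenerate pattern."""
    choices = [SYMBOLS.get(ch, ch) for ch in pattern]
    return ["".join(p) for p in product(*choices)]


hexamers = expand("GAXYCX")
# Translate each hexanucleotide as two codons to obtain the encoded dipeptide.
dipeptides = {CODON_TABLE[h[:3]] + CODON_TABLE[h[3:]] for h in hexamers}

print(len(hexamers), sorted(dipeptides))
```

The 32 possible hexanucleotides encode only four dipeptides: asp-ala, asp-pro, glu-ala and glu-pro (DA, DP, EA, EP in one-letter code). The sequence GAAGCT, encoding glu-ala, is one of the 32.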

In providing for useful DNA sequences which can be used for cassettes for expression, the following
sequence can be conveniently employed:

Tr-L-((R-S)r"-(GAXYCX)n'-W-(Gene*)d)y

wherein:

Tr intends a DNA sequence encoding for the transcriptional regulatory signals, particularly the promoter
and such other regulatory signals as operators, activators, cap signal, signals enhancing ribosomal binding,
or other sequence involved with transcriptional or translational control. The Tr sequence will generally be at
least about 100bp and not more than about 2000bp. Particularly useful is employing the Tr sequence
associated with the leader sequence L, so that a DNA fragment can be employed which includes the
transcriptional and translational signal sequences associated with the leader sequence endogenous to the
host. Alternatively, one may employ other transcriptional and translational signals to provide for enhanced
production of the expression product;
d is 0 or 1, being 1 when y is greater than 1;
n' is a whole number, generally ranging from 0 to 3, more usually being 0 or 2 to 3;
r" is 1 or 2;
W intends a terminal deoxyribosyl-3' group, or a DNA sequence which by itself or, when n' is other
than 0, in combination with the nucleotides to which it is joined, defines a restriction site, having either a
cohesive end or butt end, wherein W may have from 0 to about 20 nucleotides in the longest chain;
the remaining symbols having been defined previously.

Of particular interest is the following construct:

(Tr)a-L-(R-S)r"-(GAXYCX)n"-GA¦AGCT¦

wherein:

all of the symbols previously defined have the same definition;
a is 0 or 1 intending that the construct may or may not have the transcriptional and translational signals;
the nucleotides indicated in the broken box are intended not to be present but to be capable of addition
by adding a HindIII cleaved terminus to provide for the recreation of the sequence encoding for a
dipeptide; and
n" will be 0 to 2, where at least one of the Xs and Ys defines a nucleotide, so that the sequence in the
parenthesis is other than the sequence GAAGCT.

The coding sequence of Gene* may be joined to the terminal T, providing that the coding sequence is
in frame with the initiation codon and upon processing the first amino acid will be the correct amino acid for
the mature polypeptide.
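
The in-frame requirement can be illustrated with a minimal check. The sketch below is not part of the original disclosure: the sequences are hypothetical toy sequences, not the actual α-factor leader or hEGF gene, and the function joins_in_frame is a hypothetical helper. It tests only that the region upstream of Gene* starts with ATG, spans a whole number of codons, and contains no stop codon.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(dna):
    """Split dna into successive codons, dropping any trailing partial codon."""
    return [dna[i:i + 3] for i in range(0, len(dna) - 2, 3)]


def joins_in_frame(upstream):
    """True if a gene appended to `upstream` is read in the frame of its ATG.

    `upstream` is everything from the initiation codon to the junction.
    """
    if not upstream.startswith("ATG"):
        return False
    if len(upstream) % 3 != 0:
        return False
    return not any(c in STOP_CODONS for c in codons(upstream))


# Hypothetical toy sequences for illustration only.
TOY_LEADER = "ATGAGATTTCCT"        # stands in for L, begins with the initiation codon
PROCESSING = "AAAAGA" + "GAAGCT"   # lys-arg (R-S) followed by one GAXYCX unit (glu-ala)

in_frame = joins_in_frame(TOY_LEADER + PROCESSING)
frame_shifted = joins_in_frame(TOY_LEADER + PROCESSING + "G")  # one extra base

print(in_frame, frame_shifted)
```

With one superfluous base at the junction the check fails, which corresponds to the situation in which the first codon of Gene* would no longer encode the correct N-terminal amino acid.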
The 3'-terminus of Gene* can be manipulated much more easily and, therefore, it is desirable to provide
a construct which allows for insertion of Gene* into a unique restriction site in the construct. Such a
construct would provide for a restriction site with insertion of the Gene* into the restriction site to be in
frame with the initiation codon. Such a construction can be symbolized as follows:

(Tr)a-L-(R-S)r"-(GAXYCX)n"-W-(SC)b-Te

wherein:

those symbols previously defined have the same definition;
SC are stop codons;
Te is a termination sequence balanced with the promoter Tr, and may include other signals, e.g.
polyadenylation; and
b is an integer which will generally vary from about 0 to 4, more usually from 0 to 3, it being
understood that Gene* may include its own stop codons.

Illustrative of a sequence having the above formula is where W is the sequence GA and n" is 2.

Of particular interest is where the sequence encoding the terminal dipeptide is taken together with W to
define a linker or connector, which allows for recreation of the terminal sequence defining the dipeptide of
the processing signal and encodes for the initial amino acids of Gene*, so that the codons are in frame with
the initiation codon of the leader. The linker provides for a staggered or butt ended termination, desirably
defining a restriction site in conjunction with the successive sequences of the Gene*. Upon ligation of the
linker with Gene*, the codons of Gene* will be in frame with the initiation codon of the leader. In this manner,
one can employ a synthetic sequence which may be joined to a restriction site in the processing signal
sequence to recreate the processing signal, while providing the initial bases of the Gene* encoding for the
N-terminal amino acids. By employing a synthetic sequence, the synthetic linker can be a tailored
connector having a convenient restriction site near the 3'-terminus and the synthetic connector will then
provide for the necessary codons for the 5'-terminus of the gene.

Alternatively, one could introduce a restriction endonuclease recognition site downstream from the
processing signal to allow for cleavage and removal of superfluous bases to provide for ligation of the Gene*
to the processing signal in frame with the initiation codon. Thus the first codon would encode for the
N-terminal amino acid of the polypeptide. Where T is the first base of Gene*, one could introduce a restriction
site where the recognition sequence is downstream from the cleavage site. For example, a Sau3A
recognition sequence could be introduced immediately after the processing signal, which would allow for
cleavage and linking of the Gene* with its initial codon in frame with the leader initiation codon. With
restriction endonucleases which have the recognition sequence distal and downstream from the cleavage
site, e.g. HgaI, W could define such sequence which could include a portion of the processing signal
sequences. Other constructions can also be employed, employing such techniques as primer repair and in
vitro mutagenesis to provide for the convenient insertion of Gene* into the construct by introducing an
appropriate restriction site.

The construct provides a portable sequence for insertion into vectors, which provide the desired
replication system. As already indicated, in some instances, it may be desirable to replace the wild type
promoter associated with the leader sequence with a different promoter. In yeast, promoters involved with
enzymes in the glycolytic pathway can provide for high rates of transcription. These promoters are
associated with such enzymes as phosphoglucoisomerase, phosphofructokinase, phosphotriose isomerase,
phosphoglucomutase, enolase, pyruvic kinase, glyceraldehyde-3-phosphate dehydrogenase, and alcohol
dehydrogenase. These promoters may be inserted upstream from the leader sequence. The 5'-flanking
region to the leader sequence may be retained or replaced with the 3'-sequence of the alternative promoter.
Vectors can be prepared and have been reported which include promoters having convenient restriction
sites downstream from the promoter for insertion of such constructs as described above.

The final construct will be an episomal element capable of stable maintenance in a host, particularly a
fungal host such as yeast. The construct will include one or more replication systems, desirably two
replication systems, allowing for maintenance in the expression host and cloning in a prokaryote. In
addition, one or more markers for selection will be included, which will allow for selective pressure for
maintenance of the episomal element in the host. Furthermore, the episomal element may be a high or low
copy number, the copy number generally ranging from about 1 to 200. With high copy number episomal
elements, there will generally be at least 10, preferably at least 20, and usually not exceeding about 150,
more usually not exceeding about 100 copy number. Depending upon the Gene*, either high or low copy
numbers may be desirable, depending upon the effect of the episomal element on the host. Where the
presence of the expression product of the episomal element may have a deleterious effect on the viability
of the host, a low copy number may be indicated.

Various hosts may be employed, particularly mutants having desired properties. It should be
appreciated that depending upon the rate of production of the expression product of the construct, the processing
enzyme may or may not be adequate for processing at that level of production. Therefore, a mutant having
enhanced production of the processing enzyme may be indicated or enhanced production of the enzyme
may be provided by means of an episomal element. Generally, the production of the enzyme should be of
a lower order than the production of the desired expression product.

Where one is using α-factor for secretion and processing, it would be appropriate to provide for
enhanced production of the processing enzyme Dipeptidyl Amino Peptidase A, which appears to be the
expression product of STE13. This enzyme appears to be specific for X-Ala- and X-Pro-sequences, where X
in this instance intends an amino acid, particularly the dicarboxylic acid amino acids.

Alternatively, there may be situations where intracellular processing is not desired. In this situation, it 
would be useful to have a stel3 mutant, where secretion occurs, but the product is not processed. In this 
manner, the product may be subsequently processed in vitro. 
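
The effect of the dipeptidyl aminopeptidase step can be pictured with a small sketch. It is a deliberate simplification and not a model of the real enzyme: it assumes that any N-terminal dipeptide whose second residue is Ala or Pro is removed, one dipeptide at a time, and it ignores the stated preference for dicarboxylic residues at X. The function name and the example spacers are hypothetical; the hEGF N-terminal residues are those encoded by the synthetic gene of the Experimental section.

```python
def trim_dipeptides(peptide: str) -> str:
    """Strip N-terminal X-Ala / X-Pro dipeptides until none remains.

    Simplifying assumption: any residue is accepted at position X.
    """
    while len(peptide) >= 2 and peptide[1] in ("A", "P"):
        peptide = peptide[2:]
    return peptide


# First ten residues encoded by the synthetic hEGF gene (one-letter code).
HEGF_N_TERMINUS = "NSDSECPLSH"

if __name__ == "__main__":
    # Hypothetical spacers left on the product after cleavage at lys-arg.
    for spacer in ("EAEA", "EA", ""):
        secreted = spacer + HEGF_N_TERMINUS
        # A ste13 mutant would leave `secreted` unchanged; the wild type trims it.
        print(f"{secreted:>14} -> {trim_dipeptides(secreted)}")
```

Under these assumptions a glu-ala extension is removed only when the enzyme is present and not limiting, which is why either enhanced STE13 expression or a ste13 host may be chosen depending on where processing is wanted.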

Host mutants which provide for controlled regulation of expression may be employed to advantage. For
example, with the constructions of the subject invention where a fused protein is expressed, the transformants
have slow growth, which appears to be a result of toxicity of the fused protein. Thus, by inhibiting
expression during growth, the host may be grown to high density before changing the conditions to
permissive conditions for expression.

A temperature-sensitive sir mutant may be employed to achieve regulated expression. Mutation in any
of the SIR genes results in a non-mating phenotype due to in situ expression of the normally silent MATa
and MATα sequences present at the HML and HMR loci.

Furthermore, as already indicated, the Gene* may have a plurality of sequences in tandem, either the
same or different sequences, with intervening processing signals. In this manner, the product may be
processed in whole or in part, with the result that one will obtain the various sequences either by
themselves or in tandem for subsequent processing. In many situations, it may be desirable to provide for
different sequences, where each of the sequences is a subunit of a particular protein product.

The Gene* may encode for any type of polypeptide of interest. The polypeptide may be as small as an
oligopeptide of 8 amino acids or may be 100,000 daltons or higher. Usually, single chains will be less than
about 300,000 daltons, more usually less than about 150,000 daltons. Of particular interest are polypeptides
of from about 5,000 to 150,000 daltons, more particularly of about 5,000 to 100,000 daltons. Illustrative
polypeptides of interest include hormones and factors, such as growth hormone, somatomedins, epidermal
growth factor, the endocrine secretions, such as luteinizing hormone, thyroid stimulating hormone, oxytocin,
insulin, vasopressin, renin, calcitonin, follicle stimulating hormone, prolactin, etc.; hematopoietic factors, e.g.
erythropoietin, colony stimulating factor, etc.; lymphokines; globins; globulins, e.g. immunoglobulins;
albumins; interferons, such as α, β and γ; repressors; enzymes; endorphins, e.g. β-endorphin, enkephalin,
dynorphin, etc.

Having prepared the episomal elements containing the constructs of this invention, one may then
introduce such element into an appropriate host. The manner of introduction is conventional, there being a
wide variety of ways to introduce DNA into a host. Conveniently, spheroplasts are prepared employing the
procedure of, for example, Hinnen et al., PNAS USA (1978) 75:1919-1933 or Stinchcomb et al., EP 0 045
573 A2. The transformants may then be grown in an appropriate nutrient medium, where appropriate
maintaining selective pressure on the transformants. Where expression is inducible, one can allow for
growth of the yeast to high density and then induce expression. In those situations where a substantial
proportion of the product may be retained in the periplasmic space, one can release the product by treating
the yeast cells with an enzyme such as zymolase or lyticase.

The product may be harvested by any convenient means, purifying the protein by chromatography, 
electrophoresis, dialysis, solvent-solvent extraction, etc. 

In accordance with the subject invention, one can provide for secretion of a wide variety of polypeptides,
so as to greatly enhance product yield, simplify purification, minimize degradation of the desired
product, and simplify processing, equipment, and engineering requirements. Furthermore, utilization of
nutrients based on productivity can be greatly enhanced, so that more economical and more efficient
production of polypeptides may be achieved. Also, the use of yeast has many advantages in avoiding
enterotoxins, which may be present with prokaryotes, and in employing known techniques which have been
developed for yeast over long periods of time, which techniques include isolation of yeast products.

The following examples are offered by way of illustration and not by way of limitation.

EXPERIMENTAL 

A synthetic sequence for human epidermal growth factor (EGF), based on the amino acid sequence of
EGF reported by H. Gregory and B.M. Preston, Int. J. Peptide Protein Res. 9, 107-118 (1977), was prepared,
which had the following sequence:

5' AACTCCGACTCCGAATGTCCATTGTCCCACGACGGTTACTGTTTGCACGACGGTGTTTGT
3' TTGAGGCTGAGGCTTACAGGTAACAGGGTGCTGCCAATGACAAACGTGCTGCCACAAACA

   ATGTACATCGAAGCTTTGGACAAGTACGCTTGTAACTGTGTTGTTGGTTACATCGGTGAA
   TACATGTAGCTTCGAAACCTGTTCATGCGAACATTGACACAACAACCAATGTAGCCACTT

   AGATGTCAATACAGAGACTTGAAGTGGTGGGAATTGAGATGA
   TCTACAGTTATGTCTCTGAACTTCACCACCCTTAACTCTACT ,

where 5' indicates the promoter proximal end of the sequence. The sequence was inserted into the EcoRI
site of pBR328 to produce a plasmid p328EGF-1 and cloned.
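
As a consistency check on the printed sequence, the following sketch confirms that the two strands are complementary and translates the upper strand with the standard genetic code. Two points are assumptions: the last two bases of the second upper-strand line are taken to be "AA", inferred from the lower strand, which ends in "CTT" at that position; and the helper names are illustrative only.

```python
from itertools import product

# Upper (5'->3') and lower (3'->5') strands, joined from the three printed lines.
TOP = (
    "AACTCCGACTCCGAATGTCCATTGTCCCACGACGGTTACTGTTTGCACGACGGTGTTTGT"
    "ATGTACATCGAAGCTTTGGACAAGTACGCTTGTAACTGTGTTGTTGGTTACATCGGTGAA"  # final "AA" inferred
    "AGATGTCAATACAGAGACTTGAAGTGGTGGGAATTGAGATGA"
)
BOTTOM = (
    "TTGAGGCTGAGGCTTACAGGTAACAGGGTGCTGCCAATGACAAACGTGCTGCCACAAACA"
    "TACATGTAGCTTCGAAACCTGTTCATGCGAACATTGACACAACAACCAATGTAGCCACTT"
    "TCTACAGTTATGTCTCTGAACTTCACCACCCTTAACTCTACT"
)

COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a coding strand in frame 0; '*' marks a stop codon."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


if __name__ == "__main__":
    print("length (nt):", len(TOP))
    print("strands complementary:", TOP.translate(COMPLEMENT) == BOTTOM)
    print("translation:", translate(TOP))
```

The translation gives a 53-residue polypeptide followed by a stop codon, and the asp and ser residues at positions 3 and 4 are the ones referred to in the deletion described below.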

Approximately 30 µg of p328EGF-1 was digested with EcoRI and approximately 1 µg of the expected
190 base pair EcoRI fragment was isolated. This was followed by digestion with the restriction enzyme
HgaI. Two synthetic oligonucleotide connectors, HindIII-HgaI and HgaI-SalI, were then ligated to the 159 base
pair HgaI fragment. The HgaI-HindIII linker had the following sequence:

AGCTGAAGCT
    CTTCGATTGAG

This linker restores the α-factor processing signals interrupted by the HindIII digestion and joins the HgaI
end at the 5'-end of the EGF gene to the HindIII end of pAB112.
The HgaI-SalI linker had the following sequence:

TGAGATGATAAG
     ACTATTCAGCT

This linker has two stop codons and joins the HgaI end at the 3'-end of the EGF gene to the SalI end of
pAB112.

The resulting 181 base pair fragment was purified by preparative gel electrophoresis and ligated to
100 ng of pAB112 which had been previously completely digested with the enzymes HindIII and SalI.
Surprisingly, a deletion occurred where the codons for the 3rd and 4th amino acids of EGF, asp and ser,
were deleted, with the remainder of the EGF being retained.

pAB112 is a plasmid containing a 1.75 kb EcoRI fragment with the yeast α-factor gene cloned in the
EcoRI site of pBR322 in which the HindIII and SalI sites had been deleted (pAB11). pAB112 was derived
from plasmid pAB101 which contains the yeast α-factor gene as a partial Sau3A fragment cloned in the
BamHI site of plasmid YEp24. pAB101 was obtained by screening a yeast genomic library in YEp24 using a
synthetic 20-mer oligonucleotide probe (3'-GGCCGGTTGGTTACATGATT-5') homologous to the published
α-factor coding region (Kurjan and Herskowitz, Abstracts 1981 Cold Spring Harbor meeting on the Molecular
Biology of Yeasts, page 242).

The resulting mixture was used to transform E. coli HB101 cells and plasmid pAB201 obtained. Plasmid
pAB201 (5 µg) was digested to completion with the enzyme EcoRI and the resulting fragments were: a)
filled in with DNA polymerase I Klenow fragment; b) ligated to an excess of BamHI linkers; and c) digested
with BamHI. The 1.75 kbp EcoRI fragment was isolated by preparative gel electrophoresis and approximately
100 ng of the fragment was ligated to 100 ng of pC1/1, which had been previously digested to
completion with the restriction enzyme BamHI and treated with alkaline phosphatase.

Plasmid pC1/1 is a derivative of pJDB219, Beggs, Nature (1978) 275:104, in which the region
corresponding to bacterial plasmid pMB9 in pJDB219 has been replaced by pBR322 in pC1/1. This mixture
was used to transform E. coli HB101 cells. Transformants were selected by ampicillin resistance and their
plasmids analyzed by restriction endonucleases. DNA from one selected clone (pYEGF-8) was prepared
and used to transform yeast AB103 cells. Transformants were selected by their leu+ phenotype.

Fifty milliliter cultures of yeast strain AB103 (α, pep4-3, leu2-3, leu2-112, ura3-52, his4-580)
transformed with plasmid pYEGF-8 (deposited at the American Type Culture Collection on 5th January 1983
and given ATCC Accession no. 20658) were grown at 30° in -leu medium to saturation (optical density at
600 nm of 5) and left shaking at 30° for an additional 12 hr period. Cell supernatants were collected by
centrifugation and analyzed for the presence of human EGF using the fibroblast receptor competition
binding assay. The assay of EGF is based on the ability of both mouse and human EGF to compete with
125I-labeled mouse EGF for binding sites on human foreskin fibroblasts. Standard curves can be obtained
by measuring the effects of increasing quantities of EGF on the binding of a standard amount of 125I-labeled
mouse EGF. Under these conditions 2 to 20 ng of EGF are readily measurable. Details on the binding of
125I-labeled epidermal growth factor to human fibroblasts have been described by Carpenter et al., J. Biol.
Chem. 250, 4297 (1975). Using this assay it is found that the culture medium contains 7±1 mg of human
EGF per liter.

For further characterization, human EGF present in the supernatant was purified by absorption to the
ion-exchange resin Biorex-70 and elution with 10 mM HCl in 80% ethanol. After evaporation of the HCl and
ethanol the EGF was solubilized in water. This material migrates as a single major protein of MW approx.
6,000 in 17.5% SDS gels, roughly the same as authentic mouse EGF (MW ~6,000). This indicates that the
α-factor leader sequence has been properly excised during the secretion process. Analysis by high
resolution liquid chromatography (microbondapak C18, Waters column) indicates that the product migrates
with a retention time similar to an authentic mouse EGF standard. However, protein sequencing by Edman
degradation showed that the N-terminus retained the glu-ala sequence.

A number of other constructions were prepared using different approaches for joining hEGF to the
α-factor secretory leader sequence, providing for different processing signals and site mutagenesis. In Fig. 2,
a. through e. show the sequences of the fusions at the N-terminal region of hEGF, which differ
among the several constructions; f. shows the sequence at the C-terminal region of hEGF, which is the same
for all constructions. Synthetic oligonucleotide linkers used in these constructions are boxed.

These fusions were made as follows. Construction (a) was made as described above. Construction (b)
was made in a similar way except that linker 2 was used instead of linker 1. Linker 2 modifies the α-factor
processing signal by inserting an additional processing site (ser-leu-asp-lys-arg) immediately preceding the
hEGF gene. The resulting yeast plasmid is named pYaEGF-22. Construction (c), in which the dipeptidyl
aminopeptidase maturation site (glu-ala) has been removed, was obtained by in vitro mutagenesis of
construction (a). A PstI-SalI fragment containing the α-factor leader-hEGF fusion was cloned in phage M13
and isolated in a single-stranded form. A synthetic 31-mer of sequence
5'-TCTTTGGATAAAAGAAACTCCGACTCCCG-3' was synthesized and 70 picomoles were used as a primer
for the synthesis of the second strand from 1 picomole of the above template by the Klenow fragment of
DNA polymerase. After fill-in and ligation at 14° for 18 hrs, the mixture was treated with S1 nuclease (5
units for 15 min) and used to transfect E. coli JM101 cells. Bacteriophage containing DNA sequences in
which the region coding for (glu-ala) was removed were located by filter plaque hybridization using the
32P-labeled primer as probe. RF DNA from positive plaques was isolated, digested with PstI and SalI and the
resulting fragment inserted in pAB114 which had been previously digested to completion with SalI and
partially with PstI and treated with alkaline phosphatase.

The plasmid pAB114 was derived as follows:
plasmid pAB112 was digested to completion with HindIII and then religated at low (4 µg/ml) DNA
concentration and plasmid pAB113 was obtained in which three 63 bp HindIII fragments have been deleted
from the α-factor structural gene, leaving only a single copy of mature α-factor coding region. A BamHI site
was added to plasmid pAB11 by cleavage with EcoRI, filling in of the overhanging ends by the Klenow
fragment of DNA polymerase, ligation of BamHI linkers, cleavage with BamHI and religation to obtain
pAB12. Plasmid pAB113 was digested with EcoRI, the overhanging ends filled in, and ligated to BamHI
linkers. After digestion with BamHI the 1500 bp fragment was gel-purified and ligated to pAB12 which had
been digested with BamHI and treated with alkaline phosphatase. Plasmid pAB114, which contains a
1500 bp BamHI fragment carrying the α-factor gene, was obtained. The resulting plasmid (pAB114 containing
the above described construct) is then digested with BamHI and ligated into plasmid pC1/1.

The resulting yeast plasmid is named pYaEGF-23 and was deposited at the American Type Culture
Collection on 12th August 1983 under ATCC Accession no. 40079. Construction (d), in which a new KpnI
site was generated, was made as described for construction (c) except that the 36-mer oligonucleotide
primer of sequence 5'-GGGTACCTTTGGATAAAAGAAACTCCGACTCCGAAT-3' was used. The resulting
yeast plasmid is named pYaEGF-24. Construction (e) was derived by digestion of the plasmid containing
construction (d) with KpnI and SalI instead of linker 1 and 2. The resulting yeast plasmid is named pYaEGF-25.
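
The features of the 36-mer primer used for construction (d) can be read off directly from its sequence. The sketch below is only an illustrative check; it assumes the usual KpnI recognition sequence GGTACC and takes the first five codons of the synthetic hEGF gene from the sequence given earlier. The function and constant names are hypothetical.

```python
PRIMER_D = "GGGTACCTTTGGATAAAAGAAACTCCGACTCCGAAT"  # 36-mer quoted in the text
KPNI_SITE = "GGTACC"            # KpnI recognition sequence (assumed standard site)
EGF_START = "AACTCCGACTCCGAA"   # first five codons of the synthetic hEGF gene
LYS_ARG = "AAAAGA"              # AAA (lys) followed by AGA (arg)


def describe_primer(primer: str) -> dict:
    """Locate the KpnI site and the hEGF start, and report what precedes hEGF."""
    egf_at = primer.find(EGF_START)
    return {
        "length": len(primer),
        "kpni_at": primer.find(KPNI_SITE),
        "egf_at": egf_at,
        "six_nt_upstream": primer[egf_at - 6:egf_at],
    }


if __name__ == "__main__":
    info = describe_primer(PRIMER_D)
    print(info)
    print("lys-arg directly precedes hEGF:", info["six_nt_upstream"] == LYS_ARG)
```

In this reading the lys-arg codons abut the first hEGF codon with no glu-ala codons in between, consistent with the removal of the dipeptidyl aminopeptidase maturation site described for construction (c).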

Yeast cells transformed with pYaEGF-22 were grown in 15 ml cultures. At the indicated densities or
times, cultures were centrifuged and the supernatants saved and kept on ice. The cell pellets were washed
in lysis buffer (0.1% Triton X-100, 10 mM NaHPO4 pH 7.5) and broken by vortexing (5 min in 1 min intervals
with cooling on ice in between) in one volume of lysis buffer and one volume of glass beads. After
centrifugation, the supernatants were collected and kept on ice. The amount of hEGF in the culture medium
and cell extracts was measured using the fibroblast receptor binding competition assay. Standard curves
were obtained by measuring the effects of increasing quantities of mouse EGF on the binding of a standard
amount of 125I-labeled mouse EGF.

Proteins were concentrated from the culture media by absorption on Bio-Rex 70 resin and elution with
0.01 M HCl in 80% ethanol and purified by high performance liquid chromatography (HPLC) on a reverse
phase C18 column. The column was eluted at a flow rate of 4 ml/min with a linear gradient of 5% to 80%
acetonitrile containing 0.2% trifluoroacetic acid in 60 min. Proteins (200-800 picomoles) were sequenced at
the amino-terminal end by the Edman degradation method using a gas-phase protein sequencer, Applied
Biosystems model 470A. The normal PROTFA program was used for all the analyses. Dithiothreitol was
added to S2 (ethyl acetate: 20 mg/liter) and S3 (butyl chloride: 10 mg/liter) immediately before use. All
samples were treated with 1N HCl in methanol at 40° for 15 min to convert PTH-aspartic acid and PTH-
glutamic acid to their methyl esters. All PTH-amino acid identifications were performed by reference to
retention times on an IBM CN HPLC column using a known mixture of PTH-amino acids as standards.
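
For orientation, the linear gradient stated above (5% to 80% acetonitrile over 60 min at 4 ml/min) fixes the solvent composition and the eluted volume at any time point. The following is a minimal sketch of that arithmetic; the function names are illustrative and it assumes an ideal linear gradient with no delay volume.

```python
def acetonitrile_percent(t_min: float, start: float = 5.0,
                         end: float = 80.0, duration: float = 60.0) -> float:
    """Acetonitrile content (%) at time t_min of an ideal linear gradient."""
    if not 0 <= t_min <= duration:
        raise ValueError("time must lie within the gradient")
    return start + (end - start) * t_min / duration


def eluted_volume_ml(t_min: float, flow_ml_per_min: float = 4.0) -> float:
    """Total volume delivered after t_min at a constant flow rate."""
    return flow_ml_per_min * t_min


if __name__ == "__main__":
    for t in (0, 15, 30, 45, 60):
        print(f"{t:2d} min: {acetonitrile_percent(t):5.2f}% acetonitrile, "
              f"{eluted_volume_ml(t):5.1f} ml eluted")
```

The slope is 1.25% acetonitrile per minute, and the full run delivers 240 ml of eluent.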

Secretion from pYaEGF-22 gave a 4:1 mole ratio of native N-terminus hEGF to glu-ala terminated
hEGF, while secretion from pYaEGF-23-25 gave only native N-terminated hEGF. Yields of hEGF ranged
from 5 to 8 µg/ml measured either as protein or in a receptor binding assay.

The strain JRY188 (MAT sir3-8 leu2-3 leu2-112 trp1 ura3 his4 rme) was transformed with pYaEGF-21
and leucine prototrophs selected at 37°. Saturated cultures were then diluted 1/100 in fresh medium and
grown in leucine selective medium at permissive (24°) and non-permissive (36°) temperatures and culture
supernatants were assayed for the presence of hEGF as described above. The results are shown in the
following table.



Regulated synthesis and secretion of hEGF in transformed
yeast sir3 temperature-sensitive mutants.

Temperature   Transformant   O.D.650   hEGF (pg/ml)

36°           3a             3.5       0.010
                             5.4       0.026
              3b             3.6       0.020
                             6.4       0.024

24°           3a             0.4       34
                             1.3       145
                             2.1       1075
                             4.0       3250
              3b             0.4       32
                             1.4       210
                             2.2       1935
                             4.2       4600

These results indicate that the hybrid α-factor/EGF gene is being expressed under mating type
regulation, even though it is present on a high copy number plasmid.
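
The degree of regulation can be put in numbers by comparing the last hEGF reading of each transformant at 24° with its last reading at 36°. The sketch below uses only the values in the table; the comparison itself is an illustrative choice, not a calculation made in the text, and it ignores the fact that the final optical densities differ between the two temperatures.

```python
# (temperature, transformant) -> list of (O.D.650, hEGF) readings from the table
TABLE = {
    ("36", "3a"): [(3.5, 0.010), (5.4, 0.026)],
    ("36", "3b"): [(3.6, 0.020), (6.4, 0.024)],
    ("24", "3a"): [(0.4, 34), (1.3, 145), (2.1, 1075), (4.0, 3250)],
    ("24", "3b"): [(0.4, 32), (1.4, 210), (2.2, 1935), (4.2, 4600)],
}


def final_hegf(temperature: str, transformant: str) -> float:
    """hEGF concentration at the last (highest density) reading."""
    return TABLE[(temperature, transformant)][-1][1]


def fold_regulation(transformant: str) -> float:
    """Ratio of final hEGF at 24 degrees to final hEGF at 36 degrees."""
    return final_hegf("24", transformant) / final_hegf("36", transformant)


if __name__ == "__main__":
    for name in ("3a", "3b"):
        print(f"transformant {name}: about {fold_regulation(name):,.0f}-fold")
```

Because it is a ratio of two readings in the same units, the result does not depend on the concentration unit of the hEGF column; both transformants show a difference of roughly five orders of magnitude between the two temperatures.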

In accordance with the subject invention, novel constructs are provided which may be inserted into
vectors to provide for expression of polypeptides having an N-terminal leader sequence and one or more
processing signals to provide for secretion of the polypeptide as well as processing to result in a mature
polypeptide product free of superfluous amino acids. Thus, one can obtain a polypeptide having the
identical sequence to a naturally occurring polypeptide. In addition, because the polypeptide can be
produced in yeast, glycosylation can occur, so that products can be obtained which are identical to the
naturally occurring products. Furthermore, because the product is secreted, greatly enhanced yields can be
obtained based on cell population, and processing and purification are greatly simplified. In addition,
employing mutant hosts, expression can be regulated to be turned off or on, as desired.

Although the foregoing invention has been described in some detail by way of illustration and example
for purposes of clarity of understanding, it will be obvious that certain changes and modifications may be
practiced within the scope of the appended claims.

Claims 

Claims for the following Contracting States: BE, CH, DE, DK, ES, FR, GB, GR, IT, LI, LU, NL, SE

1. A DNA construct encoding a protein foreign to yeast, the amino acid sequence of said protein
comprising at least a yeast alpha-factor leader sequence fragment that provides for secretion linked to
a heterologous polypeptide sequence, said protein also containing yeast processing signals between
said alpha-factor leader sequence fragment and said heterologous polypeptide for processing said
protein into said heterologous polypeptide.

2. A DNA construct according to Claim 1 further comprising a yeast promoter at the 5' end.

3. A DNA construct according to any of Claims 1 - 2 wherein said heterologous polypeptide is a
mammalian protein.

4. A DNA construct comprising a sequence comprising the formula:

5'-Tr-L-Sp-Gene*-Te-3'

wherein:

Tr is a yeast promoter sequence;

L encodes at least a yeast alpha-factor leader sequence fragment that provides for secretion;

Sp is a spacer sequence encoding processing signals for processing the precursor polypeptide
encoded by L-Sp-Gene* into the polypeptide encoded by Gene*;

Gene* encodes a polypeptide foreign to yeast; and

Te is a transcription termination sequence balanced with Tr.

5. The DNA construct of Claim 4 wherein Tr comprises a yeast alpha-factor promoter sequence.

6. The DNA construct according to either of Claims 4 - 5 wherein Sp contains the sequence 5'-R1-R2-3'
immediately adjacent to the sequence Gene*, R1 being a codon for lysine or arginine, R2 being a codon
for arginine, but does not encode a processing signal for dipeptidylaminopeptidase A.

7. A DNA construct according to Claim 6 wherein Sp is 5'-R1-R2-3'.

8. A DNA construct according to Claim 4 wherein Gene* encodes a mammalian protein.

9. A DNA construct according to Claim 5 wherein Gene* encodes a mammalian protein.

10. A DNA construct according to Claim 6 wherein Gene* encodes a mammalian protein.

11. A DNA construct according to Claim 8 wherein said mammalian protein is human epidermal growth
factor.

12. A DNA construct according to Claim 9 wherein said mammalian protein is human epidermal growth
factor.

13. A DNA construct according to Claim 10 wherein said mammalian protein is human epidermal growth
factor.

14. An episomal expression element comprising a DNA construct according to Claim 8 and a replication
system providing stable maintenance in a yeast host.

15. An episomal expression element comprising a DNA construct according to Claim 9 and a replication
system providing stable maintenance in a yeast host.

16. An episomal expression element comprising a DNA construct according to Claim 10 and a replication
system providing stable maintenance in a yeast host.

17. A method for producing a recombinant polypeptide in yeast and having said polypeptide secreted into
the culture medium, said method comprising:

providing a yeast host transformed by a DNA construct according to Claim 4;

growing in said culture medium said transformed yeast under conditions whereby the precursor
polypeptide encoded by 5'-L-Sp-Gene*-3' is expressed, at least partially processed into a polypeptide
having the sequence encoded by Gene*, and secreted into said culture medium;

and recovering from said culture medium said secreted polypeptide.

18. A method according to Claim 17 wherein Tr comprises a yeast alpha-factor promoter sequence.

19. A method according to either Claim 17 or 18 wherein Sp contains the sequence 5'-R1-R2-3'
immediately adjacent to the sequence Gene*, R1 being a codon for lysine or arginine, R2 being a codon for
arginine, but does not encode a processing signal for dipeptidylaminopeptidase A.

20. A method according to Claim 19 wherein Sp is 5'-R1-R2-3'.

21. A method according to any of Claims 17-20 wherein Gene* encodes a mammalian polypeptide.

22. A method according to Claim 21 wherein said mammalian polypeptide is epidermal growth factor.

23. A method according to any of Claims 17-20 wherein said yeast is strain AB 103, ATCC No. 20658.

24. Plasmid pYEGF8, ATCC Accession No. 20658.

25. Plasmid pYaEGF23, ATCC Accession No. 40079.

26. A method for producing a recombinant polypeptide comprising:

providing a yeast host transformed by a DNA construct encoding a protein foreign to yeast, the
amino acid sequence of said protein comprising at least a yeast alpha-factor leader sequence fragment
that provides for secretion linked to a heterologous polypeptide sequence, said protein also containing
yeast processing signals between said alpha-factor leader sequence fragment and said heterologous
polypeptide for processing said protein into said heterologous polypeptide;

growing in said culture medium said transformed yeast host under conditions whereby said protein
foreign to yeast is expressed, at least partially processed into said heterologous polypeptide, and
secreted into said culture medium;

and recovering from said culture medium said secreted heterologous polypeptide.

27. A method according to Claim 26 wherein said heterologous polypeptide is a mammalian protein.

28. A method according to Claim 26 wherein said yeast alpha-factor is Saccharomyces alpha-factor.

29. A method according to claim 26 wherein said heterologous polypeptide is processed intracellularly or
extracellularly to provide a mature polypeptide.

30. A method according to claim 29 wherein said mature polypeptide is selected from the group consisting
of growth hormone, somatomedins, epidermal growth factor, insulin, renin, calcitonin, albumins and
interferons.

31. A method according to any of claims 17, 19 or 21 wherein the polypeptide encoded by Gene* is
processed intracellularly or extracellularly to provide a mature polypeptide.

32. A method according to claim 31 wherein said mature polypeptide is selected from the group consisting
of growth hormone, somatomedins, epidermal growth factor, insulin, renin, calcitonin, albumins and
interferons.

33. A method according to any of claims 26 to 32 wherein said yeast is strain AB 103, ATCC No. 20658.

34. A host cell transformed by a DNA construct according to any of claims 1, 4, 8 or 7.

Claims for the following Contracting State: AT

1. A method for producing a recombinant polypeptide comprising:

providing a yeast host transformed by a DNA construct encoding a protein foreign to yeast, the
amino acid sequence of said protein comprising at least a yeast alpha-factor leader sequence fragment
that provides for secretion linked to a heterologous polypeptide sequence, said protein also containing
yeast processing signals between said alpha-factor leader sequence fragment and said heterologous
polypeptide for processing said protein into said heterologous polypeptide;

growing in said culture medium said transformed yeast host under conditions whereby said protein
foreign to yeast is expressed, at least partially processed into said heterologous polypeptide, and
secreted into said culture medium;

and recovering from said culture medium said secreted heterologous polypeptide.

2. A method according to Claim 1 wherein said heterologous polypeptide is a mammalian protein.

3. A method according to any of Claims 1-2 wherein said DNA construct comprises a sequence
comprising the formula:

5'-Tr-L-Sp-Gene*-Te-3'

wherein:

- Tr is a yeast promoter sequence;

- L encodes said yeast alpha-factor leader sequence fragment;

- Sp is a spacer sequence encoding said yeast processing signals for processing the precursor
polypeptide encoded by L-Sp-Gene* into the polypeptide encoded by Gene*;

- Gene* encodes said heterologous polypeptide; and

- Te is a transcription termination sequence balanced with Tr.

4. A method according to Claim 3 wherein Sp contains the sequence 5'-R1-R2-3' immediately adjacent to
the sequence Gene*, R1 being a codon for lysine or arginine, R2 being a codon for arginine, but does
not encode a processing signal for dipeptidylaminopeptidase A.



5. A method according to Claim 4 wherein Sp is 5'-R1-R2-3'.

6. A method according to Claim 3 wherein Tr comprises a yeast alpha-factor promoter sequence.

7. A method according to Claim 1 wherein said yeast alpha-factor is Saccharomyces alpha-factor.

8. A method according to Claim 1 wherein said yeast alpha-factor is S. cerevisiae alpha-factor.

9. A method according to Claim 2 wherein said mammalian protein is human epidermal growth factor.

10. A method according to Claim 4 wherein Gene* encodes human epidermal growth factor.

11. A method according to Claim 5 wherein Gene* encodes human epidermal growth factor.

12. A method according to Claim 7 wherein said heterologous polypeptide comprises human epidermal
growth factor.

13. A method according to Claim 8 wherein said heterologous polypeptide comprises human epidermal
growth factor.

14. A method according to any of claims 1, 7, or 8 wherein said heterologous polypeptide is processed
intracellularly or extracellularly to provide a mature polypeptide.

15. A method according to claim 4 wherein said heterologous polypeptide is processed intracellularly or
extracellularly to provide a mature polypeptide.

16. A method according to claim 15 wherein said mature polypeptide is selected from the group consisting
of growth hormone, somatomedins, epidermal growth factor, insulin, renin, calcitonin, albumins and
interferons.

17. A method according to claim 16 wherein said mature polypeptide is selected from the group consisting
of growth hormone, somatomedins, epidermal growth factor, insulin, renin, calcitonin, albumins and
interferons.

18. A method according to any of Claims 1, 7, 8 and 15 to 17 wherein said yeast is strain AB 103, ATCC
No. 20658.

Claims (translated from the French text, "Revendications")

Claims for the following Contracting States: BE, CH, DE, DK, ES, FR, GB, GR, IT, LI, LU, NL, SE

1. A DNA construct encoding a protein foreign to a yeast, the amino acid sequence of said protein
comprising at least a yeast alpha-factor leader sequence fragment that provides for secretion linked to
a heterologous polypeptide sequence, said protein also containing yeast processing signals between
said alpha-factor leader sequence fragment and said heterologous polypeptide for processing said
protein into said heterologous polypeptide.

2. A DNA construct according to Claim 1, further comprising a yeast promoter at the 5' end.

3. A DNA construct according to Claim 1 or Claim 2, wherein said heterologous polypeptide is a
mammalian protein.

4. A DNA construct comprising a sequence comprising the formula:

5'-Tr-L-Sp-Gene*-Te-3'

wherein:

Tr is a yeast promoter sequence;

L encodes at least a yeast alpha-factor leader sequence fragment that provides for secretion;

Sp is a spacer sequence encoding processing signals for processing the precursor polypeptide
encoded by L-Sp-Gene* into the polypeptide encoded by Gene*;

Gene* encodes a polypeptide foreign to a yeast; and

Te is a termination sequence balanced with Tr.

5. A DNA construct according to Claim 4, wherein Tr comprises a yeast alpha-factor promoter
sequence.
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6. Construction d'ADN suivant la revendication 4 ou la revendication 5, dans laquefle Sp contient la 
sequence S'-Ri^-S' immediatement adjacent© a la sequence Gene*, Ri etant un codon pour la lysine 
ou Parginine. R2 etant un codon pour I'arginine, mais qui ne code pas pour un signal de maturation 
moleculaire pour la dipeptidylaminopeptidase A. 

7. Construction d'ADN suivant la revendication 6. dans laquelle Sp est 5'-Ri-R2-3\ 

a Construction d'ADN suivant la revendication 4, dans laquelle 6ene* code pour une proline de 
mammifere. 



9. Construction d'ADN suivant la revendication 5 t dans laquelle Gene* code pour une proline de 
mammifere. 



10. Construction d'ADN suivant la revendication 6 t dans laquelle Gene" code pour une proline de 
mammifere. 



11. Construction d'ADN suivant la revendication 8. dans laquelle cette proline de mammifere est un 
facteur de croissance epidermique humain. 

12. Construction, d'ADN suivant la revendication 9, dans laquelle cette proline de mammifere est un 
facteur de croissance epidermique humain. 

13. Construction d'ADN suivant la revendication 10, dans laquelle cette proline de mammifere est un 
facteur de croissance epidermique humain. 

14. Element depression episomique comprenant une construction d'ADN suivant la revendication 8 et un 
systeme de replication pourvoyant a un maintien stable dans une Ievure-h6te. 

15. Element d'expression Episomique comprenant une construction d'ADN suivant la revendication 9 et un 
systeme de replication pourvoyant a un maintien stable dans une levure-hdte. 

16. Element d'expression episomial comprenant une construction d'ADN suivant la revendication 10 et un 
systeme de replication pourvoyant a un maintien stable dans une Ievure-h8te. 

17. A method for producing a recombinant polypeptide in yeast and obtaining secretion of said
polypeptide into the culture medium, said method comprising:

providing a yeast host transformed with a DNA construct according to claim 4;

growing said transformed yeast in said culture medium under conditions whereby the precursor
polypeptide encoded by 5'-L-Sp-Gene*-3' is expressed, at least partially processed into a
polypeptide having the sequence encoded by Gene*, and secreted into said culture medium;

and recovering said secreted polypeptide from said culture medium.

18. A method according to claim 17, wherein Tr comprises a yeast alpha-factor promoter sequence.

19. A method according to claim 17 or claim 18, wherein Sp contains the sequence 5'-R1-R2-3'
immediately adjacent to the sequence Gene*, R1 being a codon for lysine or arginine and R2 being a
codon for arginine, but which does not code for a processing signal for dipeptidylaminopeptidase A.

20. A method according to claim 19, wherein Sp is 5'-R1-R2-3'.

21. A method according to any one of claims 17 to 20, wherein Gene* codes for a mammalian
polypeptide.

22. A method according to claim 21, wherein said mammalian polypeptide is an epidermal growth
factor.

23. A method according to any one of claims 17 to 20, wherein said yeast is strain AB 103,
ATCC No. 20658.

24. Plasmid pYEGF8, ATCC Accession No. 20658.

25. Plasmid pYaEGF23, ATCC Accession No. 40079.

26. A method for producing a recombinant polypeptide, comprising:

providing a yeast host transformed with a DNA construct encoding a protein foreign to yeast, the
amino acid sequence of said protein comprising at least a fragment of a yeast alpha-factor leader
sequence providing for secretion, linked to a heterologous polypeptide sequence, said protein also
containing yeast processing signals between said alpha-factor leader sequence fragment and said
heterologous polypeptide for processing said protein into said heterologous polypeptide;

growing said transformed yeast host in a culture medium under conditions whereby said protein
foreign to yeast is expressed, at least partially processed into said heterologous polypeptide,
and secreted into said culture medium;

and recovering said secreted heterologous polypeptide from said culture medium.

27. A method according to claim 26, wherein said heterologous polypeptide is a mammalian protein.

28. A method according to claim 26, wherein said yeast alpha-factor is Saccharomyces alpha-factor.

29. A method according to claim 26, wherein said heterologous polypeptide is processed
intracellularly or extracellularly to provide a mature polypeptide.

30. A method according to claim 29, wherein said mature polypeptide is selected from the group
consisting of growth hormone, somatomedins, epidermal growth factor, insulin, renin, calcitonin,
albumins and interferons.

31. A method according to any one of claims 17, 19 or 21, wherein the polypeptide encoded by Gene*
is processed intracellularly or extracellularly to provide a mature polypeptide.

32. A method according to claim 31, wherein said mature polypeptide is selected from the group
consisting of growth hormone, somatomedins, epidermal growth factor, insulin, renin, calcitonin,
albumins and interferons.

33. A method according to any one of claims 26 to 32, wherein said yeast is strain AB 103,
ATCC No. 20658.

34. A host cell transformed with a DNA construct according to any one of claims 1, 4, 6 or 7.
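
For illustration only, the codon constraint placed on the spacer Sp in claims 6, 7, 19 and 20 can be written as a small check. The Python sketch below is not part of the claims. Its function names are hypothetical, the codon sets are the lysine and arginine codons of the standard genetic code written in the DNA alphabet, and it makes no attempt to test the further requirement that Sp not code for a dipeptidylaminopeptidase A processing signal.

```python
# Codons of the standard genetic code (DNA alphabet) for the two residues
# named in the claims. These sets are general knowledge, not patent text.
LYS_CODONS = frozenset({"AAA", "AAG"})
ARG_CODONS = frozenset({"AGA", "AGG", "CGA", "CGC", "CGG", "CGT"})


def ends_with_r1_r2(spacer: str) -> bool:
    """Return True if the last two codons of `spacer` are R1-R2.

    R1 must be a lysine or arginine codon and R2 an arginine codon, so that
    R1-R2 sits immediately 5' of Gene* (the condition of claims 6 and 19).
    """
    seq = spacer.upper()
    # The spacer must hold at least two whole codons.
    if len(seq) < 6 or len(seq) % 3 != 0:
        return False
    r1, r2 = seq[-6:-3], seq[-3:]
    return r1 in (LYS_CODONS | ARG_CODONS) and r2 in ARG_CODONS


def is_exactly_r1_r2(spacer: str) -> bool:
    """Return True if the spacer consists of R1-R2 only (claims 7 and 20)."""
    return len(spacer) == 6 and ends_with_r1_r2(spacer)
```

Keeping the two conditions in separate functions mirrors the claim structure: the narrower claims (Sp is R1-R2) are a special case of the broader ones (Sp contains R1-R2 next to Gene*).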

Claims for the following Contracting State: AT

1. A method for producing a recombinant polypeptide, comprising:

providing a yeast host transformed with a DNA construct encoding a protein foreign to yeast, the
amino acid sequence of said protein comprising at least a fragment of a yeast alpha-factor leader
sequence providing for secretion, linked to a heterologous polypeptide sequence, said protein also
containing yeast processing signals between said alpha-factor leader sequence fragment and said
heterologous polypeptide for processing said protein into said heterologous polypeptide;

growing said transformed yeast host in a culture medium under conditions whereby said protein
foreign to yeast is expressed, at least partially processed into said heterologous polypeptide,
and secreted into said culture medium;

and recovering said secreted heterologous polypeptide from said culture medium.

2. A method according to claim 1, wherein said heterologous polypeptide is a mammalian protein.

3. A method according to claim 1 or claim 2, wherein said DNA construct comprises a sequence
comprising the formula:

5'-Tr-L-Sp-Gene*-Te-3'

wherein:

Tr is a yeast promoter sequence;

L codes for said yeast alpha-factor leader sequence fragment;

Sp is a spacer sequence coding for said processing signals for processing the precursor
polypeptide encoded by L-Sp-Gene* into the polypeptide encoded by Gene*;

Gene* codes for said heterologous polypeptide; and

Te is a termination sequence balanced with Tr.

4. A method according to claim 3, wherein Sp contains the sequence 5'-R1-R2-3' immediately
adjacent to the sequence Gene*, R1 being a codon for lysine or arginine and R2 being a codon for
arginine, but which does not code for a processing signal for dipeptidylaminopeptidase A.

5. A method according to claim 4, wherein Sp is 5'-R1-R2-3'.

6. A method according to claim 3, wherein Tr comprises a yeast alpha-factor promoter sequence.

7. A method according to claim 1, wherein said yeast alpha-factor is a Saccharomyces alpha-factor.

8. A method according to claim 1, wherein said yeast alpha-factor is an S. cerevisiae
alpha-factor.

9. A method according to claim 2, wherein said mammalian protein is a human epidermal growth
factor.

10. A method according to claim 4, wherein Gene* codes for a human epidermal growth factor.

11. A method according to claim 5, wherein Gene* codes for a human epidermal growth factor.

12. A method according to claim 7, wherein said heterologous polypeptide comprises a human
epidermal growth factor.

13. A method according to claim 8, wherein said heterologous polypeptide comprises a human
epidermal growth factor.

14. A method according to any one of claims 1, 7 or 8, wherein said heterologous polypeptide is
processed intracellularly or extracellularly to provide a mature polypeptide.

15. A method according to claim 4, wherein said heterologous polypeptide is processed
intracellularly or extracellularly to provide a mature polypeptide.

16. A method according to claim 15, wherein said mature polypeptide is selected from the group
consisting of growth hormone, somatomedins, epidermal growth factor, insulin, renin, calcitonin,
albumins and interferons.

17. A method according to claim 16, wherein said mature polypeptide is selected from the group
consisting of growth hormone, somatomedins, epidermal growth factor, insulin, renin, calcitonin,
albumins and interferons.

18. A method according to any one of claims 1, 7, 8 and 15 to 17, wherein said yeast is strain
AB 103, ATCC No. 20658.
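
For illustration only, the element order in the formula 5'-Tr-L-Sp-Gene*-Te-3' of claim 3 can be modelled as a simple record. The Python sketch below is a reading aid under stated assumptions, not part of the claims: the class name, field names and method names are hypothetical, and the example sequences are arbitrary placeholders rather than sequences taken from the specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Construct:
    """Elements of 5'-Tr-L-Sp-Gene*-Te-3', each a DNA string written 5' to 3'."""

    promoter: str    # Tr: yeast promoter sequence
    leader: str      # L: codes for the alpha-factor leader sequence fragment
    spacer: str      # Sp: codes for the processing signals
    gene: str        # Gene*: codes for the heterologous polypeptide
    terminator: str  # Te: termination sequence

    def precursor_coding_sequence(self) -> str:
        """Return L-Sp-Gene*, the region encoding the precursor polypeptide."""
        return self.leader + self.spacer + self.gene

    def full_sequence(self) -> str:
        """Return the elements concatenated in the order given by the formula."""
        return self.promoter + self.precursor_coding_sequence() + self.terminator

    def is_in_frame(self) -> bool:
        """True if L, Sp and Gene* each hold a whole number of codons."""
        parts = (self.leader, self.spacer, self.gene)
        return all(len(part) % 3 == 0 for part in parts)


# Placeholder sequences chosen only to exercise the class (assumption).
demo = Construct(
    promoter="TATAAA",
    leader="ATGAGA",
    spacer="AAGAGA",
    gene="AACTCC",
    terminator="TAATAA",
)
print(demo.precursor_coding_sequence())
print(demo.is_in_frame())
```

The frame check is deliberately simple: if each coding element holds whole codons, Gene* is read in the same frame as L, which is what allows the precursor to be processed into the polypeptide encoded by Gene*.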


Claims

Claims for the following Contracting States: BE, CH, DE, DK, ES, FR, GB, GR, IT, LI, LU, NL, SE

1. A DNA construct encoding a protein foreign to yeast, the amino acid sequence of the protein
comprising at least a fragment of a yeast alpha-factor leader sequence providing for secretion,
linked to a heterologous polypeptide sequence, the protein further containing yeast processing
signals between the alpha-factor leader sequence fragment and the heterologous polypeptide for
processing the protein into the heterologous polypeptide.

2. A DNA construct according to claim 1, further comprising a yeast promoter at the 5' end.

3. A DNA construct according to any one of claims 1 to 2, wherein the heterologous polypeptide is
a mammalian protein.

4. A DNA construct comprising a sequence comprising the formula:

5'-Tr-L-Sp-Gene*-Te-3'

wherein:

Tr is a yeast promoter sequence;

L codes for at least a fragment of a yeast alpha-factor leader sequence providing for secretion;

Sp is a spacer sequence coding for processing signals for processing the precursor polypeptide
encoded by L-Sp-Gene* into the polypeptide encoded by Gene*;

Gene* codes for a polypeptide foreign to yeast; and

Te is a termination sequence balanced with Tr.

5. A DNA construct according to claim 4, wherein Tr comprises a yeast alpha-factor promoter
sequence.

6. A DNA construct according to any one of claims 4 to 5, wherein Sp contains the sequence
5'-R1-R2-3' immediately adjacent to the sequence Gene*, R1 being a codon for lysine or arginine
and R2 being a codon for arginine, but not coding for a processing signal for
dipeptidylaminopeptidase A.

7. A DNA construct according to claim 6, wherein Sp is 5'-R1-R2-3'.

8. A DNA construct according to claim 4, wherein Gene* codes for a mammalian protein.

9. A DNA construct according to claim 5, wherein Gene* codes for a mammalian protein.

10. A DNA construct according to claim 6, wherein Gene* codes for a mammalian protein.

11. A DNA construct according to claim 8, wherein the mammalian protein is a human epidermal
growth factor.

12. A DNA construct according to claim 9, wherein the mammalian protein is a human epidermal
growth factor.

13. A DNA construct according to claim 10, wherein the mammalian protein is a human epidermal
growth factor.

14. An episomal expression element comprising a DNA construct according to claim 8 and a
replication system providing for stable maintenance in a yeast host.

15. An episomal expression element comprising a DNA construct according to claim 9 and a
replication system providing for stable maintenance in a yeast host.

16. An episomal expression element comprising a DNA construct according to claim 10 and a
replication system providing for stable maintenance in a yeast host.

17. A method for producing a recombinant polypeptide in yeast and obtaining secretion of the
polypeptide into the culture medium, wherein:

a yeast host transformed with a DNA construct according to claim 4 is provided;

the transformed yeast is grown in the culture medium under conditions whereby the precursor
polypeptide encoded by 5'-L-Sp-Gene*-3' is expressed, at least partially processed into a
polypeptide having the sequence encoded by Gene*, and secreted into the culture medium;

and the secreted polypeptide is recovered from the culture medium.

18. A method according to claim 17, wherein Tr comprises a yeast alpha-factor promoter sequence.

19. A method according to claim 17 or claim 18, wherein Sp contains the sequence 5'-R1-R2-3'
immediately adjacent to the sequence Gene*, R1 being a codon for lysine or arginine and R2 being a
codon for arginine, but not coding for a processing signal for dipeptidylaminopeptidase A.

20. A method according to claim 19, wherein Sp is 5'-R1-R2-3'.

21. A method according to any one of claims 17 to 20, wherein Gene* codes for a mammalian
polypeptide.

22. A method according to claim 21, wherein the mammalian polypeptide is an epidermal growth
factor.

23. A method according to any one of claims 17 to 20, wherein the yeast is strain AB 103,
ATCC No. 20658.

24. Plasmid pYEGF8, ATCC Accession No. 20658.

25. Plasmid pYaEGF23, ATCC Accession No. 40079.

26. A method for producing a recombinant polypeptide, comprising the steps of:

providing a yeast host transformed with a DNA construct encoding a protein foreign to yeast, the
amino acid sequence of the protein comprising at least a fragment of a yeast alpha-factor leader
sequence providing for secretion, linked to a heterologous polypeptide sequence, the protein
further containing yeast processing signals between the alpha-factor leader sequence fragment and
the heterologous polypeptide for processing the protein into the heterologous polypeptide;

growing the transformed yeast host in a culture medium under conditions whereby the protein
foreign to yeast is expressed, at least partially processed into the heterologous polypeptide, and
secreted into the culture medium;

and recovering the heterologous polypeptide from the culture medium.

27. A method according to claim 26, wherein the heterologous polypeptide is a mammalian protein.

28. A method according to claim 26, wherein the yeast alpha-factor is Saccharomyces alpha-factor.

29. A method according to claim 26, wherein the heterologous polypeptide is processed
intracellularly or extracellularly to obtain a mature polypeptide.

30. A method according to claim 29, wherein the mature polypeptide is selected from the group
consisting of growth hormone, somatomedins, epidermal growth factor, insulin, renin, calcitonin,
albumins and interferons.

31. A method according to any one of claims 17, 19 or 21, wherein the polypeptide encoded by Gene*
is processed intracellularly or extracellularly to obtain a mature polypeptide.

32. A method according to claim 31, wherein the mature polypeptide is selected from the group
consisting of growth hormone, somatomedins, epidermal growth factor, insulin, renin, calcitonin,
albumins and interferons.

33. A method according to any one of claims 26 to 32, wherein the yeast is strain AB 103,
ATCC No. 20658.

34. A host cell transformed with a DNA construct according to any one of claims 1, 4, 6 or 7.

Claims for the following Contracting State: AT

1. A method for producing a recombinant polypeptide, comprising the steps of:

providing a yeast host transformed with a DNA construct encoding a protein foreign to yeast, the
amino acid sequence of the protein comprising at least a fragment of a yeast alpha-factor leader
sequence providing for secretion, linked to a heterologous polypeptide sequence, the protein
further containing yeast processing signals between the alpha-factor leader sequence fragment and
the heterologous polypeptide for processing the protein into the heterologous polypeptide;

growing the transformed yeast host in a culture medium under conditions whereby the protein
foreign to yeast is expressed, at least partially processed into the heterologous polypeptide, and
secreted into the culture medium;

and recovering the heterologous polypeptide from the culture medium.

2. A method according to claim 1, wherein the heterologous polypeptide is a mammalian protein.

3. A method according to any one of claims 1 to 2, wherein the DNA construct comprises a sequence
comprising the formula:

5'-Tr-L-Sp-Gene*-Te-3'

wherein:

Tr is a yeast promoter sequence;

L codes for at least a fragment of a yeast alpha-factor leader sequence providing for secretion;

Sp is a spacer sequence coding for the processing signals for processing the precursor polypeptide
encoded by L-Sp-Gene* into the polypeptide encoded by Gene*;

Gene* codes for the heterologous polypeptide; and

Te is a termination sequence balanced with Tr.

4. A method according to claim 3, wherein Sp contains the sequence 5'-R1-R2-3' immediately
adjacent to the sequence Gene*, R1 being a codon for lysine or arginine and R2 being a codon for
arginine, but not coding for a processing signal for dipeptidylaminopeptidase A.

5. A method according to claim 4, wherein Sp is 5'-R1-R2-3'.

6. A method according to claim 3, wherein Tr comprises a yeast alpha-factor promoter sequence.

7. A method according to claim 1, wherein the yeast alpha-factor is a Saccharomyces alpha-factor.

8. A method according to claim 1, wherein the yeast alpha-factor is the S. cerevisiae
alpha-factor.

9. A method according to claim 2, wherein the mammalian protein is a human epidermal growth
factor.

10. A method according to claim 4, wherein Gene* codes for a human epidermal growth factor.

11. A method according to claim 5, wherein Gene* codes for a human epidermal growth factor.

12. A method according to claim 7, wherein the heterologous polypeptide comprises a human
epidermal growth factor.

13. A method according to claim 8, wherein the heterologous polypeptide comprises a human
epidermal growth factor.

14. A method according to any one of claims 1, 7 or 8, wherein the heterologous polypeptide is
processed intracellularly or extracellularly to obtain a mature polypeptide.

15. A method according to claim 4, wherein the heterologous polypeptide is processed
intracellularly or extracellularly to obtain a mature polypeptide.

16. A method according to claim 15, wherein the mature polypeptide is selected from the group
consisting of growth hormone, somatomedins, epidermal growth factor, insulin, renin, calcitonin,
albumins and interferons.

17. A method according to claim 16, wherein the mature polypeptide is selected from the group
consisting of growth hormone, somatomedins, epidermal growth factor, insulin, renin, calcitonin,
albumins and interferons.

18. A method according to any one of claims 1, 7, 8 and 15 to 17, wherein the yeast is strain
AB 103, ATCC No. 20658.


